Introduction to the Internet

A Talk to be given September 23, 1995 to the Houston Geological Society

Copyright 1995 - Alan K. Jackson


Talk Outline

In preparing for this talk I thought it would be helpful to talk about where the internet came from. Not being completely certain myself, I decided I had better educate myself. So I cruised the internet for about an hour, and came away with around 100 pages of history, timelines, and statistics.

What is the Internet?

Timeline

In 1957, an event that one of my history Profs called the most significant event of the latter half of this century occurred.

The Soviet Union launched Sputnik.

In the hysteria that followed, in fear that we had fallen behind in technology, a project was started at the RAND Corporation to figure out how to build a command and control network that was able to survive a nuclear attack. After much study, the conclusion was reached that the network would have to operate with no central authority, and would have to be designed to work even if large portions of it were missing.

This work was the origin of packet-switching, which is the basic technology used on the internet today. Note that it was important that there be *no central authority*.
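To make the "no central authority" idea concrete, here is a minimal sketch in Python. The five-node network and its node names are hypothetical, and the breadth-first search is only a stand-in for real routing, which differs in detail. It illustrates just one point: a message can still find a path after a node is knocked out, so long as some route survives.

```python
from collections import deque

# Hypothetical five-node mesh; each node lists its direct neighbors.
NETWORK = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def find_route(network, source, destination, destroyed=()):
    """Breadth-first search for a path that avoids destroyed nodes."""
    dead = set(destroyed)
    if source in dead or destination in dead:
        return None
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for neighbor in network[node]:
            if neighbor not in dead and neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no surviving route


print(find_route(NETWORK, "A", "E"))                   # route via B
print(find_route(NETWORK, "A", "E", destroyed=["B"]))  # reroutes via C
print(find_route(NETWORK, "A", "E", destroyed=["D"]))  # None: D was the only link to E
```

Note the last case: in this made-up network, node D is a single point of failure for E. A network meant to survive damage needs more than one path to every node.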

2. ARPANET

In 1969, the first real network built using packet-switching was put into service. It was funded by the Defense Department's Advanced Research Projects Agency (ARPA, later renamed DARPA), and called ARPANET. There were four nodes in the network: UCLA, the Stanford Research Institute (SRI), UCSB, and the University of Utah.

The original intent of the network was to allow researchers to share each other's computing facilities and move data back and forth. Quickly, however, actual usage of the network became dominated by a completely unforeseen phenomenon: electronic mail.

Soon the email use was broadened into electronic conferencing, where mail messages were automatically forwarded to a list of subscribers, and online electronic meetings began to take place. This was the origin of the Listserver. The first really popular list was SF-Lovers, a list for science fiction fans.

3. Growth

Throughout the 1970's the network grew as more machines and institutions joined. In the mid-seventies, TCP/IP was developed and replaced the earlier software that had been used. The growth was possible for two reasons. First of all, since the network had no central machine running things, there was no central bottleneck to prevent new machines from joining the network. Secondly, the communication protocol allowed any brand of computer to communicate with any other brand. The barriers to joining up were very low.

4. 1980's

In 1979, USENET was started between Duke and the University of North Carolina. The internet would never be the same. USENET is a public bulletin board where people can post what they will, and anyone with internet access can see it. In 1993 there were about 2.5 million people reading USENET each month.

Why do people like USENET? If you write a letter to the editor of the Chronicle on some topic you feel strongly about, they may or may not publish it, and even if they do, how many people are likely to read it? A few thousand? But if you post to a USENET group, you are guaranteed that you will be published, and depending on what group it is, you could be read by thousands to tens of thousands of people, all over the world. A little scary perhaps, but also very exciting.

There was a small committee that tried to control the content and organization of the internet. But in 1987, they tried to prevent the formation of a group to discuss recreational drugs, which caused one of the committee members to rebel and start the infamous alt newsgroups, which no one controls. The first alt groups were alt.drugs, alt.sex, and for purely aesthetic purposes, alt.rock-n-roll. alt.gourmand was also added by the rebellious committee member. It was quickly realized that, due to the basic architecture of the internet, attempts at control were futile. To control the internet, every system manager at every node must agree.

5. The 1990's - Archie, Gopher, and *The WEB*

Folks had been using ftp for some time. FTP just allows you to transfer a file from one machine to another, and many organizations, especially governmental and educational organizations, had set up anonymous ftp servers, which would allow anyone to log into their system and download data files or software. Even the U.S. Army got into the act with the famous SIMTEL PC-software archive.

How can you go get something if you don't know where it is? In 1990, McGill University released Archie, a query engine that searches ftp sites. A system called Gopher was developed next (1991, at the University of Minnesota) - it was a way to easily present menus of available items and transfer them to a client. And in 1992, the University of Nevada released Veronica, which is a search engine for Gopher.

Let me give you an example of what Veronica can do. About a year ago I began keeping sourdough starter. (By the way, there is a USENET newsgroup devoted to sourdough.) One evening I decided that I would like to make sourdough pancakes the next morning, but I didn't have a recipe. So I logged on, and told Veronica "search for sourdough and pancake". About thirty seconds later she came back with about a dozen hits. I downloaded the one from Kent, England, and made delicious pancakes.

But the big news in 1992 came from Switzerland, where the great European particle physics laboratory, CERN, released the first version of the World-Wide Web. The internet would never be the same again. This summer, Web traffic became the dominant traffic on the internet, with about 24,000 GBytes transferred per month.
Growth plot

b. Who runs the internet and how does someone get connected?

Nobody runs the internet. New machines connect to the internet by establishing a phone connection to any other machine on the net. No one regulates who connects, and each machine pays its own way. Individuals connect by going through a service provider - a university, a bulletin board, an online service (Prodigy, America Online), or an Internet Service Provider (ISP).

II. What are the components of the internet?

BBS, Online Services, and the Internet

a. BBS's, Online Services, and the internet

A bulletin board is a privately owned, local system that allows dialup by subscribers. Often it is a guy and a PC, with a fairly minimal investment. Bulletin boards generally have a "theme" to attract users: there are Christian bulletin boards, society-based bulletin boards, pornographic bulletin boards, vendor bulletin boards, whatever. Some boards now have internet connections, and provide one way to access the internet. Some allow their clients to log in from the internet. But they are, basically, separate from the internet, and not a part of it.

Online services are bulletin boards with a gland condition. The "Big Three" are Prodigy, CompuServe, and America Online. Each service has over a million subscribers and has local phone connections nationwide. CompuServe is available internationally. They all provide a plethora of local services: chat rooms, software, stock quotes, airline reservations, games, etc. They also all provide internet access: mail, newsgroups, ftp, and Web.

The internet, remember, is not owned by anyone. It is just the outcome of interconnecting some 40,000 different public and private networks world-wide in a standard, unrestricted fashion.

Mail, News, the Web

I look at the internet as having five primary components. There is email, where it all began. The closest analogue to email I can think of is the telephone. Email is point-to-point. I mail a message to you. Of course I can mail one message to five "you's", which is definitely an advantage over the telephone. And I never get put on hold or play telephone tag, but the analogy is still pretty close.

Mailing lists and news are similar components. They are like standing around the water cooler or the coffee machine and having impromptu discussions. Topics come up, the conversation wanders around the topic, and it winds down. Both lists and newsgroups are set up to address specific topics or groups, and each one develops a particular personality over time. They are wonderful forums for meeting people from around the world, and for learning about the topics at hand.

FTP is like using a library. You go check out a document, or software, or data from a library system. Except that since what you get is a copy, you don't have to give it back! Many universities and government agencies set up FTP sites to make their data and documents available to the public. Software vendors do a similar thing. Patches and updates, documentation, free software and examples are often available from vendors' sites.

The WEB is also like a library, except that publishing has just become a lot easier. Anyone with access to a web server, which will soon include all the online service subscribers, can publish their own stuff as web pages.

Reliability and Expectations

You should have no expectations of reliability or accuracy for items on the internet. There is no quality control on the net, except what individuals do on their own. Remember that anyone can post anything on the net. Yes, there are viruses in some software you can download. Yes, people post things to newsgroups that are wrong or even dangerous. Use your judgment. If it sounds too good to be true, it probably is. If it sounds incredible, it probably isn't true.

On the other hand, there is some wonderful stuff out there. If you learn what the reliable sites are, you can get lots of goodies for free. Government sites and vendor sites are usually quite good, and some university sites as well. But do be careful.

Inda gave me a wonderful metaphor for the net, the grocery-store bulletin board - you know, that bulletin board near the door where all the yardmen and jacks-of-all-trades post their business cards? The store takes no responsibility for the quality of the work you may receive if you hire one of them; it just provides space for them to advertise. It is very much caveat emptor. And the internet is no different. It is a space for people to put stuff, to advertise, to share things. Just be careful.

Security - data, information, and money.

Is the internet secure? I want to send my daily drilling reports from a tight hole into the office on the internet. Is that okay?

Sure, if you don't care who reads them!

Messages get passed from machine to machine along the internet to get to their destination. A single message may make five or ten hops as it travels across country. At each hop, it gets put onto someone's disk drive.

Think of internet messages the way you think of postcards. A postcard is wide open, and anyone who happens upon it can read it.

If you really want to send valuable material on the internet, whether data, reports, or things like credit card numbers, use encryption. There are some good encryption schemes that even the government can't crack. Be aware, though, that there are legal impediments to using them.
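The postcard point can be shown with a small sketch in Python. Everything here is made up for illustration: the hop names, the report text, and the relay function. The encryption is a toy one-time pad, which assumes a random key as long as the message, used only once and shared with the office ahead of time by some safe route. It is not the scheme that real encryption software uses, but it shows what each relay machine ends up holding.

```python
import secrets


def xor_bytes(data, key):
    """XOR each byte of data with the matching byte of key."""
    return bytes(d ^ k for d, k in zip(data, key))


def relay(message, hops):
    """Pass a message along; every hop keeps a copy it could read."""
    copies = {}
    for hop in hops:
        copies[hop] = message
    return copies


HOPS = ["hop-1", "hop-2", "hop-3"]   # hypothetical relay machines
report = b"Drilling at 9,500 ft"     # made-up report text

# Sent as a postcard: every hop stores the readable report.
plain_copies = relay(report, HOPS)

# Sent encrypted: the key was shared in advance and never travels the net.
key = secrets.token_bytes(len(report))
ciphertext = xor_bytes(report, key)
sealed_copies = relay(ciphertext, HOPS)

# Only the office, which holds the key, can recover the report.
recovered = xor_bytes(ciphertext, key)

print(plain_copies["hop-2"])   # readable by whoever runs hop-2
print(sealed_copies["hop-2"])  # random-looking bytes
print(recovered == report)
```

The number of copies left along the way is the same in both cases. What changes is whether those copies are any use to the people who happen upon them.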

III. What can I do on the internet? Real-life examples

Email - collaboration and communication

I have been using email in various forms for about 15 years now. I've been on the internet for about five. What good is email? Well, here are some examples.

I ordered some software once from a company in Tasmania. Up to the point where I faxed the order form, we communicated via email, which was free.

I have sent in bug reports to vendors by cutting and pasting the error message into an email note, and sending it along.

Recently, an old roommate was coming in from out of town. We negotiated what we would do, where he would stay, etc. over email. No long-distance charges.

Routinely I mail messages to everyone at Pecten. No trees killed, no waiting for the mail service, delivery is instantaneous.

b. mailing lists - discuss common interests

Mailing lists are formed for a specific purpose. Some are serious, some are frivolous. I am currently on mailing lists devoted to:

Anglicanism, Bread Machines, Carpal Tunnel, Arc/INFO software, The Indigo Girls, Geostatistics, News from Venezuela, Weather bulletins, State Department Travel bulletins, Gravity and Magnetics, Geophysical Workstations, and many more.

The Arc/INFO list in particular is a very active list. There are probably 20 or 30 messages every day on the list. Typically someone will send in a message like:

I'm looking for a way to find the distance from the centroids of a polygon coverage to a line. The 'near' command is almost perfect, except that it will only work on a point coverage, not on a polygon cover.

Usually within hours, several answers are received. Many times the answers are accompanied by code and examples. The original poster will then summarize the answers and post that to the list so that everyone can benefit.

For Pecten's use, I receive State Department travel bulletins by mail and make them available to our travelers. Similarly, I receive local weather forecasts and all the tropical storm bulletins and post them for Pecten users.

c. data - retrieve data via ftp

I have done a LOT of FTP, both for data and for software. For example, I have downloaded:

- Arc/INFO coverages of all the roads, lakes, rivers, aquifers, and county boundaries for the Texas coast. This is 75 Mbytes worth of data, and comes from the State of Texas server.

- Digital elevation models for various quad sheets in the US and a copy of the CIA World DataBank II country outline data from the Xerox server.

- DOS software for geochemistry, geology, mapping, and other items from the COGS server in Denver.

- Climate and elevation data from the National Center for Atmospheric Research in Boulder.

- Roget's Thesaurus, the CIA World Factbook, and Alice in Wonderland from the Gutenberg server in Illinois.

- Every month I go retrieve a current table of ISO country names and codes from a server in Germany. This has been a useful source of standard names and spellings of countries - they seem to change so frequently.

Plus a whole lot more.

d. Software - freeware and shareware

There are several software sites around the world. In England there is a site devoted to indexing other sites for Mac software. I have downloaded software for Macs, for DOS, and for Unix from all over the world. Many times, folks will put software out there for you to try, and if you like it, you are morally obligated to send them some money to keep using it. This is called 'shareware'.

There is also a lot of software out there that is free. Some of it is very good. There is a group of very bright, committed programmers who sincerely believe that software should be free, and that programmers should only be paid for maintaining it, not for owning it. This is the philosophy behind the GNU project and the Free Software Foundation. You can agree or disagree with their philosophical position, but I can say that their software is excellent. Much better than a lot of stuff I have paid for, and it is free. I wholeheartedly recommend all the GNU software, and the Perl freeware. Perl isn't part of the GNU package, but has the same copyright treatment.

All of this stuff is available free by ftp. On my PC at home I have most of my favorite Unix utilities, acquired for free.

e. Information - gleaned from lists, web pages, newsgroups, etc.

Here is a sampling of things I have found.

I needed to prepare for this talk. So I surfed the internet for a few hours and found all this (show notebook).

About two years ago I learned about the Weitek CPU-chip upgrade for Sun workstations from a USENET discussion group. Our in-house IT organization wasn't pursuing it at all, but based on information I saw on the newsgroup, I pursued it, and it turned out to be a really good thing. We would not have even known about it if I didn't read the newsgroups.

I have learned about vendors and data sources from newsgroups and webpages. I find the web particularly valuable for finding vendors in areas that I'm not very familiar with. Someone needed some software for doing translations from English to Spanish. So I cruised the WEB for a little while, and quickly got a sense of what was available, and about how much it would cost.

What is out there?

IV. What about pornography? Is the internet dangerous?

a. The Rimm report and Time magazine

Earlier this year, Marty Rimm, an Electrical Engineering undergraduate at Carnegie Mellon, published a study of pornography on the internet in the Georgetown Law Journal.

Obviously a completely objective, balanced treatment.

What Mr. Rimm did was look at the part of USENET that is devoted primarily to carrying image data, 32 newsgroups out of the 17,000 that exist, decide that 17 of these were pornographic, and count the images in those 17. Within this restricted subset, 87% of the images were deemed pornographic.
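To put those counts in proportion, here is the arithmetic as a short Python sketch. It uses only the three group counts quoted above; the variable names are mine.

```python
# Newsgroup counts as quoted in the talk.
TOTAL_NEWSGROUPS = 17_000
IMAGE_NEWSGROUPS = 32
PORNOGRAPHIC_NEWSGROUPS = 17

# Share of all newsgroups that carry mostly images.
image_share = IMAGE_NEWSGROUPS / TOTAL_NEWSGROUPS

# Share of all newsgroups that were judged pornographic.
porn_share = PORNOGRAPHIC_NEWSGROUPS / TOTAL_NEWSGROUPS

print(f"Image newsgroups:        {image_share:.2%} of all newsgroups")
print(f"Pornographic newsgroups: {porn_share:.2%} of all newsgroups")
```

In other words, the groups whose images were counted make up about one newsgroup in a thousand.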

Note that it is very easy to restrict access to a few newsgroups. At Shell, for example, our postmaster simply doesn't allow most of the alt newsgroups (where these live) to ever come into our system.

Mr. Rimm actually spent the majority of his effort studying the content of private, adult BBS's. This is where the 917,000 pornographic images came from. These are bulletin boards that people subscribe to, are given a phone number and a password, and then pay for images they download. This is not the internet.

Time Magazine, in their July 3 issue, reported on the Rimm report. In their article they consistently failed to distinguish clearly between adult Bulletin Boards and the internet, and misquoted several of the statistics, or presented them in a misleading fashion.

Finally, Senator Grassley on the floor of the Senate, commenting on the Rimm report said, "The university surveyed 900,000 computer images. Of these 900,000 images, 83.5 percent of all computerized photographs available on the Internet are pornographic. Mr. President, I want to repeat that: 83.5 percent of the 900,000 images reviewed--these are all on the Internet-- are pornographic, according to the Carnegie Mellon study."

His comments are, of course, completely fallacious, but laws were proposed to fix the problem!

b. Replies to the Rimm report

A number of experts commented on the Rimm report.

Brian Reid, a research manager for DEC, noted that the numbers Rimm quoted were mislabeled as to what they actually mean. Rimm had confused images being available with images being downloaded. And Brian should know: Rimm used Brian's software to gather the statistics.

Other experts noted that the overall methodology had severe statistical problems, and that it was not always clear what was being measured.

c. My take on it

Is there pornography on the internet? Sure. Is the internet a place where, to quote the Rimm report, "pornography permeates the digital landscape"?

I've been internet cruising for about 5 years now. In all that time I have run into one dirty picture. Some idiot posted a dirty picture to the wrong newsgroup. It was compressed and encoded, so it took some knowledge and effort to unpack it and even determine what it was. When I saw what it was, I sent a copy to the fellow's postmaster. He lost his account the next day.

No, pornography does not permeate the digital landscape, any more than it permeates I-45 coming back from the airport!

Would I let my child cruise the internet?

Today, no. But he is only eight years old. The internet is not a toy, and it is not an entertainment vehicle. It is a communication tool, designed originally for computer-knowledgeable adults. No one has ever intended it to be a place for children to safely play. I wouldn't give my son the keys to my car, either.

That said, when he is a year or two older, I will let him on, but I will keep tabs on what he does and who he talks with. You know whose house your kid is playing at; you should know where he is cruising the internet.

There are also products available that will lock out most of the pornography. As Rimm noted, most of it is confined to a few special places, so locking it out from someone is not too difficult.

V. Summary

The things I most want you to remember from what I have said are the following: