Summary: This laboratory exercise introduces a general framework for documents which you routinely find on the World-Wide Web. In addition to the general structure of html documents, the lab reviews various html tags, including simple formatting, links, and images.
For this lab, it will be convenient for you to be able to move easily between viewing two different web pages: this one, and one you will modify. This can be done by opening multiple "tabs" in your browser as follows:
If all went well, this should have opened a second "tab" in your browser. (Near the top of the window, there should be icons that look like index-tabs. You can click on either of them to switch between viewing multiple pages you have open.)
For this lab, you will use and modify a copy of the document found here:
www.cs.grinnell.edu/~walker/fluency-book/labs/sample-html-page.html
To get started, follow these steps (but you may want to read all the steps before you begin):
As you will see, this page contains the first two paragraphs of Chapter 1 of Walker, Henry M., The Tao of Computing: A Down-to-earth Approach to Computer Fluency, Jones and Bartlett, 2005.
Once you can view the sample document in your browser, select Page
Source from the View menu (or select
View Page Source from the menu obtained by right-clicking
anywhere on the web page). This will show you the original document
sample-html-page.html, as sent by the server to your
browser (and before the browser has done its formatting).
As you will see in the new source window, a typical html Web page has the following basic format:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
...
</head>
<body>
...
</body>
</html>
According to the standards of the World
Wide Web Consortium (W3C), the first line of the document must
indicate the type of document and the specific version of html being
used. In this case the document type (DOCTYPE) and the
version follow the international standard for
HTML 4.01
Transitional. This opening line informs your browser how to
interpret the rest of the page.
In the lines that follow, formatting commands (also
called tags) are given in angle
brackets: < > . For
example, <head> marks the beginning of header
information for the document, and </head> indicates
the end of the same section. Similarly, <body>
marks the beginning of the main body of the page, continuing through
</body> near the end of the file. Some tags are for
markup that has no beginning and end, such as drawing a horizontal
line, <hr>.
The following table gives some common formatting commands, some of which are illustrated in this example:
| Command | Meaning |
|---|---|
| <html> | begin an HTML document |
| <head> | begin the header section |
| <title> | begin a title |
| <body> | begin the body of the document |
| <h1> | begin a header1 section (headers can be h1 through h6) |
| <center> | center the material that follows |
| <p> | begin a new paragraph |
| <br> | break a line (begin a new line) |
| <strong> | begin strong text (typically bold typeface) |
| <em> | begin emphasized text (typically italic typeface) |
| <u> | begin underlining text |
| <hr> | draw a horizontal line |
| <blockquote> | indent the section, as in a quotation |
While these formatting directives explain many of the elements in
sample-html-page.html, the fourth line in the sample file
requires some additional comment. In html, a meta
instruction is used to supply information about the page itself to
the browser. In this case, this full line indicates that characters
on this page are encoded according to the ISO-8859-1 or
"Latin-1" character set that is suitable for most Western European
languages. (In contrast, "ISO-8859-5" would indicate a Cyrillic
alphabet, "SHIFT_JIS" would specify a Japanese encoding, and
"EUC-JP" another Japanese encoding. Other choices also are
possible.)
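As a small illustration of the point above, the lines below show the same meta instruction written for three of the character sets just named. This is only a sketch: a real page would contain one such line, and the comments are added here for explanation.

```html
<!-- Latin-1, as in sample-html-page.html -->
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

<!-- Cyrillic alternative: only the charset value changes -->
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-5">

<!-- One of the Japanese encodings mentioned above -->
<meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">
```

In each case the rest of the line stays the same; the browser reads the charset value to decide how to turn the bytes of the file into characters.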
For more information about HTML, you might try the primer A Beginner's Guide to HTML, a very old but still accurate guide written by Marc Andreessen, co-author of Mosaic (one of the first widely used graphical web browsers) and co-founder of Netscape. There are also more in-depth tutorials.
Look through the source of sample-html-page.html to find the title tag. When
viewing the sample web page through your browser, can you find where the
page's "title" appears?
The process for "publishing" an HTML document (i.e., making it available for others to view on the World Wide Web) includes three basic components:
Unfortunately, the details for each of these steps vary considerably from one computer system to another. The directions below will help you through the process on the MathLAN. These instructions are typical of computer systems using the Linux operating system.
For Web access, a Linux Web server looks for files within a directory
called public_html, located in a user's home directory.
To be accessible by anyone over the World Wide Web, you must give
everyone permission to navigate through both your home directory and
the public_html directory, in order to access your html
files.
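In the Linux world, the relevant permissions (for these directories) are encoded as the number 755, or the permissions string
drwxr-xr-x
The number can be interpreted as follows.
The first digit, 7, applies to the owner of the directory. In binary, 7 is 111, which corresponds to the
permissions rwx.
The second and third digits, both 5, apply to the owner's group and to everyone else. In binary, 5 is 101, which is why the permissions
appear as r-x.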
The following steps establish the needed directory and permissions.
Create the directory public_html with the following command (except you should
replace
username with your own username). Note that the
character between "public" and "html" must be an underscore.
mkdir /home/username/public_html
Allow others to pass through your home directory and to read your public_html
subdirectory. Since Web users may be anywhere in the world, this
means that you must allow everyone this access. You can use the two
commands that follow to do this:
chmod 711 /home/username
chmod 755 /home/username/public_html
While you could create a file for Web viewing in many ways, a simple way is to begin with an existing Web document and then to edit it. Follow these steps to copy the sample page to your account and make it accessible to the World-Wide Web.
Copy the sample page to your public_html directory:
cp ~walker/public_html/fluency-book/labs/sample-html-page.html public_html/
cd public_html
Make the file readable by everyone. (The number 644 indicates that the owner of
the file can read and modify it, while everyone else can only
read it. We are not giving anyone permission to "execute" the
file.)
chmod 644 sample-html-page.html
View your copy of the page in your browser at the URL below. (Replace username with your
MathLAN username. In addition, the tilde character is
necessary; however, since the file is on a Linux system, the Web
server automatically looks in your public_html
directory, so you do not need to include that directory name in the
URL.)
http://www.cs.grinnell.edu/~username/sample-html-page.html
If you have any trouble viewing the sample web page at this point, please ask for help.
gedit.
To open the file in gedit, use the following command:
gedit sample-html-page.html &
(The ampersand at the end of the command means that the terminal window will still be available for you to type further commands in, should you need to.)
<strong> and </strong>.
</p>
and <p>.
<em> and
</em>.
(Note that when you use multiple tags like this, they should
not be interleaved. In other words, it is wrong to say
<em><strong> followed
by </em></strong>. Rather, they should be
nested like matching layers of an onion. Even if the web page
displays correctly for you when you do this, it might not
display correctly for someone using a different browser.)
<u> and
</u>.
<h1> and
</h1>.
<h1> This is my
new title</h1>. Reload the page to see what
happens.
Change the <h1> header near the top of the page
to
<h2>, <h3>,
or <h4>. In each case, describe what happens.
What pattern do you see as you go from <h1> to <h2>
to <h3> to
<h4>?
Insert an <hr> tag (for "horizontal
rule") just before the end of the "body" of the web page. There is no
closing tag that goes with <hr>.
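As a recap of the formatting steps above, the fragment below is a minimal sketch of how nested tags, headers, and a horizontal rule might look together in the body of a page. Apart from the title text, which comes from the step above, the wording inside the tags is invented for illustration and is not taken from the sample page.

```html
<!-- Headers: a larger number gives a smaller heading -->
<h1>This is my new title</h1>
<h2>A smaller heading</h2>

<!-- Correct nesting: the last tag opened is the first tag closed -->
<p>Some <em><strong>very important</strong></em> words.</p>

<!-- Incorrect (shown inside a comment so it has no effect):
     <p>Some <em><strong>very important</em></strong> words.</p>
     Here the closing tags are interleaved rather than nested. -->

<!-- A horizontal rule has no closing tag -->
<hr>
```

Pasting the fragment just before </body> and reloading the page should show two headings, a sentence with bold italic words, and a horizontal line.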
Within Web pages, it is common to include links to other documents. Indeed, this is what "hypertext" is all about. The simplest way to specify a link is through the use of an anchor tag.
Add the following line to the body of sample-html-page.html:
<a href="http://www.grinnell.edu">a link to the Grinnell College web page</a>
Don't forget to save your file and reload it into your browser.
In your browser, follow the link you just created. (Of course, you know the back button will return you to the page you are working on. Did you know that the keystroke combination alt+← will do the same thing?)
To review how an anchor works, the opening tag begins with the letter
a, and the end of the anchor has the standard
format </a>. Within the opening tag, the
reference href specifies the URL that will be linked to.
The words between the open and close tags (in this case
"a link to the Grinnell College web page") become the label for the
link.
Add the following line to sample-html-page.html:
<img src="http://www.cs.grinnell.edu/~weinman/gfx/weinman-photo.jpg" alt="Jerod Weinman">
Upon reloading the page in the browser, you will encounter a picture of your instructor.
Note that this image tag does not have a corresponding closing tag.
(In XHTML-style markup, this is indicated by a / before the closing
bracket, as in <br />; HTML 4.01 does not require the slash.)
Within the img tag, the src attribute identifies
the file name and location for the source of the image; the
alt attribute specifies what text the browser should display
if the image is unavailable. Depending on the browser you are using, if
you move the mouse over the image, the
alternate text may also be displayed. It is good to always
provide alternate text: some users may have set their browsers to
not display images; other users may have special software that reads
web pages to them, and it will read the alternate text if any is
provided.
Here we also remind you of a detail about how web pages work. By
adding the img command given above to your web page,
you have simply instructed the web browser of a visitor to your web
page that you'd like a particular image (accessible by a URL) to be
shown at a particular place in your page. The user's web browser
then fetches that image from wherever it lives elsewhere on the
web. Thus, you have not literally copied the source image, though
your visitor has. That visitor's copying is a necessary artifact of
the way the web is designed, and it's an idea the courts have
grappled with minimally.
Embedding an image hosted on another site in this way is often called "inline linking" or "hotlinking." It is closely related to "deep linking," which was the subject of a 1999 lawsuit between Ticketmaster and tickets.com, in which the court eventually found the practice legal because it did not involve direct copying. (You can read more about the case at Wikipedia.)
Although the practice is legal, strictly speaking, it no doubt raises ethical issues involving honesty in representation and the use of resources belonging to others (i.e., web hosts), which are often supported by advertising revenue that may be circumvented. As it turns out, there are technical ways to prevent such "remote" deep linking, by inspecting the source of the request.
Our best practice will be to use images that you have permission to deep link, or that you have permission to copy to your own web host and share via the web.
Remember that it is appropriate to resize images that you intend to display on your web page, so that they are not too large (and so that they do not take too long to download when others view your page).
You can also specify the exact dimensions at which the image will be displayed using attributes of the img tag, as shown below. Just be aware that this approach, used alone, does nothing to speed up the download. It only changes what the viewer sees, not the size of the actual image file.
Insert
height="100"
anywhere within the
img tag, and review what happens in the browser.
Remove the height specification and insert instead
width="150"
Again, review what happens in the browser.
height="200"
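To make the placement of these attributes concrete, here is a sketch of the image tag from earlier in the lab with each size setting in turn. The third variant, which combines both attributes, is an assumption about what the last step intends.

```html
<!-- Height only: the browser scales the width to keep the proportions -->
<img src="http://www.cs.grinnell.edu/~weinman/gfx/weinman-photo.jpg"
     alt="Jerod Weinman" height="100">

<!-- Width only: the height is scaled to match -->
<img src="http://www.cs.grinnell.edu/~weinman/gfx/weinman-photo.jpg"
     alt="Jerod Weinman" width="150">

<!-- Both (assumed combination): the image is stretched or squeezed
     if 150 by 200 does not match its original proportions -->
<img src="http://www.cs.grinnell.edu/~weinman/gfx/weinman-photo.jpg"
     alt="Jerod Weinman" width="150" height="200">
```

In all three cases the browser downloads the same image file; only the displayed size changes.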
You should already have a directory of images named 105. Check to see that you do with
the command:
ls -l ~/105
If you don't have that directory and the images in it, you can quickly do the steps in the "Preparation" for the previous lab.
Copy the images to your public_html directory with
the following command. (Note that the asterisk indicates that all
files in the 105 directory that end with ".jpg" should be
copied.)
cp ~/105/*.jpg ~/public_html/
chmod 644 ~/public_html/*.jpg
Modify the src attribute of the image tag by replacing the
address of the current image with that of the new image. For
example:
<img src="http://www.cs.grinnell.edu/~username/water-lilies.jpg">
You will also want to modify the
alt attribute of
the image tag appropriately. (Also, don't forget to
change username to your own user name.)
If you have gotten this far and have additional time, surf to the demo web page that we created in class. View its html source code to review how lists are created in html, and then create a list on your sample web page.
The list you have made, if you followed the example on the demo
Web page from class, is called an "un-ordered" list. To modify your
list to become an "ordered" list, change the
<ul> and </ul> tags that delimit
the list to:
<ol> and </ol>. Try
this to see how it affects the appearance of your list.
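For comparison, the sketch below shows the same three items written first as an unordered list and then as an ordered list. It assumes that each item is marked with the li tag, as is usual in HTML; the item text is made up and is not taken from the demo page.

```html
<!-- Unordered list: items are shown with bullets -->
<ul>
  <li>Formatting tags</li>
  <li>Links</li>
  <li>Images</li>
</ul>

<!-- Ordered list: the same items, numbered 1, 2, 3 -->
<ol>
  <li>Formatting tags</li>
  <li>Links</li>
  <li>Images</li>
</ol>
```

Only the outer tags differ; the items themselves are unchanged.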
<tr> and </tr> useful for
this.
<td> and </td>.
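As a rough sketch of how the tr and td tags named above fit together, the fragment below builds a table with two rows of two cells each. The enclosing table tag, its border attribute, and the cell contents are assumptions added for illustration.

```html
<!-- border="1" draws lines around the cells so the layout is visible -->
<table border="1">
  <!-- Each tr element is one row of the table -->
  <tr>
    <!-- Each td element is one cell within the row -->
    <td>Row 1, cell 1</td>
    <td>Row 1, cell 2</td>
  </tr>
  <tr>
    <td>Row 2, cell 1</td>
    <td>Row 2, cell 2</td>
  </tr>
</table>
```

As with the other paired tags in this lab, every tr and td that is opened should be closed, and cells should sit inside rows rather than the other way around.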