HTML Syntax
article by frankzzsword
What is HTML?
HTML documents are plain-text (also known as ASCII) files that can be created using any text editor (e.g., Emacs or vi on UNIX machines; SimpleText on a Macintosh; Notepad on a Windows machine). You can also use word-processing software if you remember to save your document as "text only with line breaks".
HTML Editors
Your assignment calls for you to use a text editor like Notepad.
Tags Explained
An element is a fundamental component of the structure of a text document. Some examples of elements are heads, tables, paragraphs, and lists. Think of it this way: you use HTML tags to mark the elements of a file for your browser. Elements can contain plain text, other elements, or both.
To denote the various elements in an HTML document, you use tags. HTML tags consist of a left angle bracket (<), a tag name, and a right angle bracket (>). Tags are usually paired (e.g., <H1> and </H1>) to start and end the tag instruction. The end tag looks just like the start tag except a slash (/) precedes the text within the brackets. HTML tags are listed below.
NOTE: HTML tag names are not case sensitive. <title> is equivalent to <TITLE> or <TiTlE>.
The Minimal HTML Document
Every HTML document should contain certain standard HTML tags. Each document consists of head and body text. The head contains the title, and the body contains the actual text that is made up of paragraphs, lists, and other elements. Browsers expect specific information because they are programmed according to HTML and SGML specifications.
Required elements are shown in this sample bare-bones document:
<html>
<head>
<TITLE>A Simple HTML Example</TITLE>
</head>
<body>
<H1>HTML is Easy To Learn</H1>
<P>Welcome to the world of HTML.
This is the first paragraph. While short it is
still a paragraph!</P>
<P>And this is the second paragraph.</P>
</body>
</html>
The required elements are the <html>, <head>, <title>, and <body> tags (and their corresponding end tags). Because you should include these tags in each file, you might want to create a template file with them. (Some browsers will format your HTML file correctly even if these tags are not included. But some browsers won't! So make sure to include them.)
HTML Markup Tags
Title
The title element contains your document title and identifies its content in a global context. The title is typically displayed in the title bar at the top of the browser window, but not inside the window itself. The title is also what is displayed on someone's hotlist or bookmark list, so choose something descriptive, unique, and relatively short. A title is also used to identify your page for search engines.
Body
The second--and largest--part of your HTML document is the body, which contains the content of your document (displayed within the text area of your browser window). The tags explained below are used within the body of your HTML document.
Headings
HTML has six levels of headings, numbered 1 through 6, with 1 being the largest. Headings are typically displayed in larger and/or bolder fonts than normal body text. The first heading in each document should be tagged <H1>. The syntax of the heading element is:
<Hn>Text of heading </Hn>
where n is a number between 1 and 6 specifying the level of the heading. Do not skip levels of headings in your document. For example, don't start with a level-one heading (<H1>) and then next use a level-three (<H3>) heading.
Paragraphs
Unlike documents in most word processors, carriage returns in HTML files aren't significant. In fact, any amount of whitespace -- including spaces, linefeeds, and carriage returns -- is automatically compressed into a single space when your HTML document is displayed in a browser. So you don't have to worry about how long your lines of text are. Word wrapping can occur at any point in your source file without affecting how the page will be displayed. In the bare-bones example shown in the Minimal HTML Document section, the first paragraph is coded as
<P>Welcome to the world of HTML.
This is the first paragraph.
While short it is
still a paragraph!</P>
In the source file there is a line break between the sentences. A Web browser ignores this line break and starts a new paragraph only when it encounters another <P> tag.
Important: You must indicate paragraphs with <P> elements. A browser ignores any indentations or blank lines in the source text. Without <P> elements, the document becomes one large paragraph. (One exception is text tagged as "preformatted" with the <PRE> tag.) For example, the following would produce the same output as the body of the first bare-bones HTML example:
<H1>HTML is Easy To Learn</H1>
<P>Welcome to the world of HTML. This is the
first paragraph. While short it is still a
paragraph! </P> <P>And this is the second paragraph.</P>
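As a further sketch of these rules (the heading and paragraph text here is invented for illustration), the fragment below uses headings in order without skipping a level, and types its paragraphs with uneven indentation and extra blank lines. A browser should compress each run of whitespace to a single space, so the layout of the source should not change what is displayed.

```html
<H1>State Statistics</H1>

<P>This paragraph is typed
      across several lines
   with uneven indentation.</P>


<H2>Atlantic States</H2>

<P>This paragraph follows a level-two heading,
so no heading level is skipped.</P>
```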
To preserve readability in HTML files, put headings on separate lines, use a blank line or two where it helps identify the start of a new section, and separate paragraphs with blank lines (in addition to the <P> tags). These extra spaces will help you when you edit your files (but your browser will ignore the extra spaces because it has its own set of rules on spacing that do not depend on the spaces you put in your source file).
NOTE: The </P> closing tag may be omitted. This is because browsers understand that when they encounter a <P> tag, it means that the previous paragraph has ended. However, since HTML now allows certain attributes to be assigned to the <P> tag, it's generally a good idea to include it.
Using the <P> and </P> as a paragraph container means that you can center a paragraph by including the ALIGN=alignment attribute in your source file.
<P ALIGN=CENTER>
This is a centered paragraph.
[See the formatted version below.]
</P>
This is a centered paragraph.
It is also possible to align a paragraph to the right instead, by including the ALIGN=RIGHT attribute. ALIGN=LEFT is the default alignment; if no ALIGN attribute is included, the paragraph will be left-aligned.
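A right-aligned paragraph might be written as in the sketch below; the paragraph text is a placeholder.

```html
<P ALIGN="RIGHT">
This paragraph is aligned to the right margin.
</P>
```

Note that later HTML versions deprecate the ALIGN attribute in favor of style sheets, so treat this as an illustration of the syntax described here rather than a recommendation.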
Lists
HTML supports unnumbered, numbered, and definition lists. You can nest lists too, but use this feature sparingly because too many nested items can get difficult to follow. For full details on lists, see A Beginner's Guide to HTML.
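As a minimal sketch, the three list types are written with the UL, OL, and DL tags (standard HTML tags, though not named in this article). The item text below is invented for illustration.

```html
<!-- Unnumbered list: each item is marked with a bullet -->
<UL>
  <LI>apples</LI>
  <LI>bananas</LI>
</UL>

<!-- Numbered list: the browser numbers the items -->
<OL>
  <LI>oranges</LI>
  <LI>peaches</LI>
</OL>

<!-- Definition list: alternating terms (DT) and definitions (DD) -->
<DL>
  <DT>NCSA</DT>
  <DD>National Center for Supercomputing Applications</DD>
</DL>

<!-- Nested list: an unnumbered list inside a numbered item -->
<OL>
  <LI>Atlantic States
    <UL>
      <LI>Maine</LI>
      <LI>New York</LI>
    </UL>
  </LI>
</OL>
```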
Forced Line Breaks
The <BR> tag forces a line break with no extra (white) space between lines. Using <P> elements for short lines of text such as postal addresses results in unwanted additional white space. For example, with
National Center for Supercomputing Applications<BR>
605 East Springfield Avenue<BR>
Champaign, Illinois 61820-5518<BR>
The output is:
National Center for Supercomputing Applications
605 East Springfield Avenue
Champaign, Illinois 61820-5518
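For contrast, here is a sketch of the same address marked up with <P> elements instead of <BR>. Most browsers would put vertical space between the three lines, which is the unwanted effect described above.

```html
<P>National Center for Supercomputing Applications</P>
<P>605 East Springfield Avenue</P>
<P>Champaign, Illinois 61820-5518</P>
```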
Horizontal Rules
The <HR> tag produces a horizontal line the width of the browser window. A horizontal rule is useful to separate major sections of your document. You can vary a rule's size (thickness) and width (the percentage of the window covered by the rule). Experiment with the settings until you are satisfied with the presentation. For example: <HR SIZE=4 WIDTH="50%">
displays as a rule four pixels thick that spans half the width of the browser window.
Linking
The chief power of HTML comes from its ability to link text and/or an image to another document or section of a document. A browser highlights the identified text or image with color and/or underlines to indicate that it is a hypertext link (often shortened to hyperlink or just link). HTML's single hypertext-related tag is <A>, which stands for anchor. To include an anchor in your document:
1. start the anchor with <A (include a space after the A)
2. specify the document you're linking to by entering the parameter HREF="filename" followed by a closing right angle bracket (>)
3. enter the text that will serve as the hypertext link in the current document
4. enter the ending anchor tag: </A> (no space is needed before the end anchor tag)
Here is a sample hypertext reference in a file called US.html: <A HREF="MaineStats.html">Maine</A>
This entry makes the word Maine the hyperlink to the document MaineStats.html, which is in the same directory as the first document.
Relative Pathnames Versus Absolute Pathnames
You can link to documents in other directories by specifying the relative path from the current document to the linked document. For example, a link to a file NYStats.html located in the subdirectory AtlanticStates would be: <A HREF="AtlanticStates/NYStats.html">New York</A>
These are called relative links because you are specifying the path to the linked file relative to the location of the current file. You can also use the absolute pathname (the complete URL) of the file, but relative links are more efficient in accessing a server. They also have the advantage of making your documents more "portable" -- for instance, you can create several web pages in a single folder on your local computer, using relative links to hyperlink one page to another, and then upload the entire folder of web pages to your web server. The pages on the server will then link to other pages on the server, and the copies on your hard drive will still point to the other pages stored there.
It is important to point out that UNIX is a case-sensitive operating system where filenames are concerned, while DOS and the MacOS are not. For instance, on a Macintosh, "DOCUMENT.HTML", "Document.HTML", and "document.html" are all the same name. If you make a relative hyperlink to "DOCUMENT.HTML", and the file is actually named "document.html", the link will still be valid. But if you upload all your pages to a UNIX web server, the link will no longer work. Be sure to check your filenames before uploading.
Pathnames use the standard UNIX syntax. The UNIX syntax for the parent directory (the directory that contains the current directory) is "..". (For more information consult a beginning UNIX reference text such as Learning the UNIX Operating System from O'Reilly and Associates, Inc.)
If you were in the NYStats.html file and were referring to the original document US.html, your link would look like this: <A HREF="../US.html">United States</A>
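The relative links in this section can be gathered into one sketch. The directory layout shown in the comment is an assumption inferred from the examples above, not something the article states in full.

```html
<!--
  Assumed layout, inferred from the examples in this section:

    US.html
    MaineStats.html
    AtlanticStates/
        NYStats.html
-->

<!-- In US.html: link to a file in the same directory -->
<A HREF="MaineStats.html">Maine</A>

<!-- In US.html: link down into a subdirectory -->
<A HREF="AtlanticStates/NYStats.html">New York</A>

<!-- In AtlanticStates/NYStats.html: link up to the parent directory -->
<A HREF="../US.html">United States</A>
```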
In general, you should use relative links whenever possible because:
1. it's easier to move a group of documents to another location (because the relative path names will still be valid)
2. it's more efficient connecting to the server
3. there is less to type
However, use absolute pathnames when linking to documents that are not directly related. For example, consider a group of documents that comprise a user manual. Links within this group should be relative links. Links to other documents (perhaps a reference to related software) should use absolute pathnames instead. This way if you move the user manual to a different directory, none of the links would have to be updated.
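A sketch of that guideline, using hypothetical names throughout: intro.html and install.html stand for two files of the user manual, and the www.example.com address stands for an unrelated document on another server.

```html
<!-- In intro.html: relative link to another page of the same manual -->
<A HREF="install.html">Installation</A>

<!-- In intro.html: absolute link (complete URL) to an unrelated document -->
<A HREF="http://www.example.com/software/index.html">Related software</A>
```

If the manual's folder is moved, the first link should keep working because both files move together, and the second should keep working because it does not depend on where the manual lives.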
Inline Images
Most Web browsers can display inline images (that is, images next to text) that are in X Bitmap (XBM), GIF, or JPEG format. Other image formats are also being incorporated into Web browsers [e.g., the Portable Network Graphic (PNG) format]. Each image takes additional time to download and slows down the initial display of a document. Carefully select your images and the number of images in a document. To include an inline image, enter: <IMG SRC=ImageName>
where ImageName is the URL of the image file. The syntax for <IMG SRC> URLs is identical to that used in an anchor HREF. If the image file is a GIF file, then the filename part of ImageName must end with .gif. Filenames of X Bitmap images must end with .xbm; JPEG image files must end with .jpg or .jpeg; and Portable Network Graphic files must end with .png.
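A few hedged examples of the <IMG> tag follow. The file names are hypothetical, and the ALT attribute in the last line is a standard <IMG> attribute that this article does not cover; it supplies text for browsers that cannot display the image.

```html
<!-- Image in the same directory as the document -->
<IMG SRC="logo.gif">

<!-- Image in a subdirectory, using a relative path as in an anchor HREF -->
<IMG SRC="images/map.jpg">

<!-- The same image with alternate text for non-graphical browsers -->
<IMG SRC="images/map.jpg" ALT="Map of the Atlantic states">
```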
Background Graphics
Newer versions of Web browsers can load an image and use it as a background when displaying a page. Some people like background images and some don't. In general, if you want to include a background, make sure your text can be read easily when displayed on top of the image. Background images can be a texture (linen finished paper, for example) or an image of an object (a logo, possibly). You create the background image as you do any image. However, you only have to create a small piece of the image. Using a feature called tiling, a browser takes the image and repeats it across and down to fill your browser window. In sum, you generate one image, and the browser replicates it enough times to fill your window. This action is automatic when you use the BACKGROUND attribute shown below. The background image is specified as an attribute of the <BODY> tag: <BODY BACKGROUND="filename.gif">
Background Color
By default, older browsers display text in black on a gray background, while current browsers use a white background. However, you can change both elements if you want. Some HTML authors select a background color and coordinate it with a change in the color of the text. Always preview changes like this to make sure your pages are readable. (For example, many people find red text on a black background difficult to read!) In general, try to avoid using high-contrast images or images that use the color of your text anywhere within the graphic.
You change the color of text, links, visited links, and active links (links that are currently being clicked on) using further attributes of the <BODY> tag. For example: <BODY BGCOLOR="#000000" TEXT="#FFFFFF" LINK="#9690CC">
This creates a window with a black background (BGCOLOR), white text (TEXT), and silvery hyperlinks (LINK). The six-digit number and letter combinations represent colors by giving their RGB (red, green, blue) value. The six digits are actually three two-digit numbers in sequence, representing the amount of red, green, or blue as a hexadecimal value in the range 00-FF. For example, 000000 is black (no color at all), FF0000 is bright red, 0000FF is bright blue, and FFFFFF is white (fully saturated with all three colors). These number and letter combinations are generally rather cryptic. Fortunately an online resource is available to help you track down the combinations that map to specific colors and there is software available for you to do this on your workstation:
* VisiBone Online Color Lab for the Webmaster's Palette
For some basic colors -- typically those in the standard sixteen-color Windows 3.1 palette -- you can also use the name of the color instead of the corresponding RGB value. For example, "black", "red", "blue", and "cyan" are all valid for use in place of RGB values. However, while not all browsers will understand all color names, any browser that can display colors will understand RGB values, so use them whenever possible.
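The <BODY> example above sets only LINK. Visited and active link colors use the VLINK and ALINK attributes, which are standard <BODY> attributes but are not named in this article. In the sketch below, the VLINK and ALINK values are arbitrary choices for illustration, and the two tags are alternatives; a document has only one <BODY> start tag.

```html
<!-- Alternative 1: all colors given as RGB values -->
<BODY BGCOLOR="#000000" TEXT="#FFFFFF"
      LINK="#9690CC" VLINK="#CC9690" ALINK="#FF0000">

<!-- Alternative 2: the same kind of setting written with color names -->
<BODY BGCOLOR="black" TEXT="white" LINK="cyan">
```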
Mailto
You can make it easy for a reader to send electronic mail to a specific person or mail alias by using a mailto: address as the HREF of a hyperlink. The format is: <A HREF="mailto:emailinfo@host">Name</a>
For example, enter: <A HREF="mailto:pubs@ncsa.uiuc.edu">
NCSA Publications Group</a>
to create a mail window that is already addressed to the NCSA Publications Group alias. (You, of course, will enter another mail address!)