
Descriptions of how htm2txt and wenost work

Introduction

This page describes how htm2txt, wenost and PPWizard operate. The programs can be used in the sequence described, first extracting data from existing web pages and then using that data to build a website with a new look and navigation, but it is not necessary to do so.

In the following descriptions some simplifications have been made as an aid to understanding the basic operation of the programs.

htm2txt

As illustrated below, htm2txt extracts from each .html file (page.html) that it processes:

page.html (as displayed)

    Page Title
    [body text]

page.html (source)

    <html>
    <head>
    <title>Page Title</title>
    [other head info]
    </head>
    <body>
    [body text]
    </body>
    </html>

--> page.head

    [other head info]

--> page.body

    [body text]

--> structure.txt

    .
    .
    .
    page       Page Title

Having extracted data into the structure.txt file it is up to the user to then edit that file to place the pages listed in the desired order and to use indentation to indicate the tree structure of the website.
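
The extraction described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the htm2txt source: the function names (split_page, structure_line) are hypothetical, the regular expressions handle only simple, well-formed pages, and the 10-character name column is a guess taken from the illustration.

```python
import re

def _inner(tag: str, text: str) -> str:
    """Return the content between <tag ...> and </tag>, or '' if absent."""
    match = re.search(rf"<{tag}\b[^>]*>(.*?)</{tag}>", text,
                      re.IGNORECASE | re.DOTALL)
    return match.group(1) if match else ""

def split_page(html: str) -> dict:
    """Split a page into its title, the rest of the head, and the body."""
    head = _inner("head", html)
    title = _inner("title", head).strip()
    # Everything in <head> other than the <title> element (cf. page.head).
    other_head = re.sub(r"<title\b[^>]*>.*?</title>", "", head,
                        flags=re.IGNORECASE | re.DOTALL).strip()
    # Content of <body> (cf. page.body).
    body = _inner("body", html).strip()
    return {"title": title, "head": other_head, "body": body}

def structure_line(name: str, title: str) -> str:
    """Format one structure.txt entry; the column width is an assumption."""
    return f"{name:<10} {title}"

SAMPLE = """<html>
<head>
<title>Page Title</title>
[other head info]
</head>
<body>
[body text]
</body>
</html>"""

parts = split_page(SAMPLE)
print(parts["head"])                           # would be written to page.head
print(parts["body"])                           # would be written to page.body
print(structure_line("page", parts["title"]))  # would go into structure.txt
```

A real tool would also need to cope with missing tags and malformed markup, which this sketch ignores.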

wenost

As illustrated below, for each line that wenost processes from the structure file (structure.txt) it produces the following (assuming the line is for "page"):

page.it
Contains navigation data as PPWizard macros, plus PPWizard includes for two PPWizard header files (common.ih and <$PageSource>.ih). It is up to the user to define the "PageSource" PPWizard macro in the common.ih header file and to place a PPWizard include for page.txt in <$PageSource>.ih. As an aid to upgrading from wenost version 0.90, this illustration also has the previously required "HeaderSource" and "FooterSource" macros defined in common.ih. Apart from these requirements, it is up to the user to decide what to put in the four header files common.ih, <$PageSource>.ih, header.ih and footer.ih.
page.txt
On first running wenost on a structure file, a skeleton page content file (page.txt) is produced. This can be based on a template (template.txt); otherwise it is based on a simple default. On subsequent invocations of wenost (i.e. if page.txt exists) the page content file is updated. If a template file is used, its content is up to the user. In the illustration below it contains a PPWizard include of page.body.
page.crumbs.txt
The file page.crumbs.txt is created with a list of pages from the top node down to the current page. The list is interspersed with macros (bc_start, bc_item, bc_end, bc_colour and bc_ruoloc) that must be defined in common.ih.
page.links.txt
If the page has child nodes then page.links.txt is created with a list of links to the child pages. The list is interspersed with macros (list_start, list_item, list_end, list_link and LinksPath) that must be defined in common.ih.
page.head
If the file page.head does not exist then wenost outputs it with some PPWizard comments that describe the file, including CVS Id and Log tags, plus meta data describing the page and giving keywords. The page.head file is not illustrated in the figure below.
structure.txt

    index      Home Page
        .
        .
        page       Page Title
            nod1       Node 1
            nod2       Node 2
        .
            .
            .
            .
        .
        .

--> page.it

    #define [navigation data]
    #define ThisPage page
    #define ThisTitle Page Title

    #include common.ih

    #include <$PageSource>.ih

--> page.crumbs.txt

    <$bc_start>
      <a href="index.<$html>"><$bc_colour>Home Page<$bc_ruoloc></a>
      <$bc_item> Page Title
    <$bc_end>

--> page.links.txt

    <$list_start>
    <$list_link NAME="nod1" TITLE="Node 1" PAGE="nod1" PATH="<$LinksPath>nod1">
    <$list_item>
    <$list_link NAME="nod2" TITLE="Node 2" PAGE="nod2" PATH="<$LinksPath>nod2">
    <$list_end>

template.txt

    <table width="95%" border="1" align="center"><tr><td>
    #include <$ThisPage>.body
    </td></tr></table>

--> page.txt

    <table width="95%" border="1" align="center"><tr><td>
    #include <$ThisPage>.body
    </td></tr></table>
    <hr>
    <p>See:
    #include <$ThisPage>.links.txt
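
The way the indented structure file drives the breadcrumb and child-link files can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the wenost implementation: the function names are hypothetical, indentation is assumed to use spaces only, and the breadcrumb layout for pages more than one level below the top node is a guess, since the illustration shows only one level.

```python
def parse_structure(text: str) -> dict:
    """Parse indented 'name  title' lines into nodes keyed by page name."""
    nodes = {}
    stack = []  # (indent, node) pairs forming the current ancestor chain
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        parts = line.split(None, 1)
        title = parts[1].strip() if len(parts) > 1 else ""
        node = {"name": parts[0], "title": title,
                "parent": None, "children": []}
        # Drop entries that are not ancestors of this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if stack:
            node["parent"] = stack[-1][1]
            stack[-1][1]["children"].append(node)
        stack.append((indent, node))
        nodes[node["name"]] = node
    return nodes

def crumbs(node: dict) -> str:
    """Build page.crumbs.txt content: top node down to the current page."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node["parent"]
    chain.reverse()
    lines = ["<$bc_start>"]
    for i, item in enumerate(chain):
        name, title = item["name"], item["title"]
        if i < len(chain) - 1:
            text = f'<a href="{name}.<$html>"><$bc_colour>{title}<$bc_ruoloc></a>'
        else:
            text = title  # the current page is not linked
        prefix = "" if i == 0 else "<$bc_item> "
        lines.append("  " + prefix + text)
    lines.append("<$bc_end>")
    return "\n".join(lines)

def links(node: dict):
    """Build page.links.txt content, or None if the page has no children."""
    if not node["children"]:
        return None
    entries = []
    for child in node["children"]:
        name, title = child["name"], child["title"]
        entries.append(f'<$list_link NAME="{name}" TITLE="{title}" '
                       f'PAGE="{name}" PATH="<$LinksPath>{name}">')
    return "\n".join(["<$list_start>",
                      "\n<$list_item>\n".join(entries),
                      "<$list_end>"])

STRUCTURE = """index      Home Page
    page       Page Title
        nod1       Node 1
        nod2       Node 2
"""

site = parse_structure(STRUCTURE)
print(crumbs(site["page"]))
print(links(site["page"]))
```

Keeping a stack of (indent, node) pairs means that only relative indentation matters, so the structure file can use any consistent indent width.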

PPWizard

Following on from the illustrations above, when a wenost-generated file (such as page.it) is processed by PPWizard, an HTML file is produced, pulling in common.ih and <$PageSource>.ih. In turn, <$PageSource>.ih could include header.ih, page.txt and footer.ih; header.ih could include page.head and page.crumbs.txt; and page.txt could include page.body and page.links.txt. This is illustrated below.
page.it

    #define [navigation data]
    #define ThisPage page
    #define ThisTitle Page Title

    #include common.ih

    #include <$PageSource>.ih

--> page.html (source)

    <head>
    <title>Example Site - Page Title</title>
    [other head info]
    </head>
    <body>
    <a href="index.html">Home Page</a> &gt; Page Title
    [navigation bar using navigation data defined in page.it]
    <table width="95%" border="1" align="center"><tr><td>
    [body text]
    </td></tr></table>
    <hr>
    <p>See:
    <ul>
    <li><a href="nod1.html">Node 1</a>
    <li><a href="nod2.html">Node 2</a>
    </ul>
    <hr>
    [navigation links using navigation data defined in page.it]
    This page last updated [date of page.txt via PPWizard macro]
    </body>
    </html>

page.html (as displayed)

    Example Site - Page Title
    Home Page > Page Title
    [navigation bar using nav data defined in page.it]
    [body text]

    See:

      • Node 1
      • Node 2

    [navigation links using nav data defined in page.it]
    This page last updated [date of page.txt]

The following files feed into page.html:

common.ih (included by page.it)

    #define SiteName Example Site
    #define HeaderSource header
    #define FooterSource footer
    #define PageSource page-source

    #define list_start <ul><li>
    #define list_item <li>
    #define list_end </ul>
    #define list_link \
    <a href="{$PATH}.html">{$TITLE}</a>
    #define LinksPath

    #define bc_start
    #define bc_item &gt;
    #define bc_end
    #define bc_colour
    #define bc_ruoloc

page-source.ih (included by page.it as <$PageSource>.ih)

    #include <$RootPath><$HeaderSource>.ih
    #include <$RootPath><$ThisPath>.txt
    #include <$RootPath><$FooterSource>.ih

header.ih (included by page-source.ih)

    <head>
    <title><$SiteName> - <$ThisTitle></title>
    #include <$ThisPage>.head
    </head>
    <body>
    #include <$ThisPage>.crumbs.txt
    [navigation bar using navigation data defined in page.it]

page.head (included by header.ih)

    [other head info]

page.crumbs.txt (included by header.ih)

    <$bc_start>
      <a href="index.<$html>"><$bc_colour>Home Page<$bc_ruoloc></a>
      <$bc_item> Page Title
    <$bc_end>

page.txt (included by page-source.ih)

    <table width="95%" border="1" align="center"><tr><td>
    #include <$ThisPage>.body
    </td></tr></table>
    <hr>
    <p>See:
    #include <$ThisPage>.links.txt

page.body (included by page.txt)

    [body text]

page.links.txt (included by page.txt)

    <$list_start>
    <$list_link NAME="nod1" TITLE="Node 1" PAGE="nod1" PATH="<$LinksPath>nod1">
    <$list_item>
    <$list_link NAME="nod2" TITLE="Node 2" PAGE="nod2" PATH="<$LinksPath>nod2">
    <$list_end>

footer.ih (included by page-source.ih)

    <hr>
    [navigation links using navigation data defined in page.it]
    This page last updated [date of page.txt via PPWizard macro]
    </body>
    </html>
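
The chain of includes and macro references described in this section can be mimicked with a toy preprocessor. The Python sketch below is not PPWizard and does not reproduce its actual behaviour. It assumes that macros are expanded when used rather than when defined, and it omits macro parameters such as {$PATH}, line continuation, and the RootPath and ThisPath macros that would come from the navigation data. The names expand, process and build are hypothetical, and the files are held in a dictionary rather than read from disk.

```python
import re

MACRO = re.compile(r"<\$(\w+)>")

def expand(text: str, macros: dict, max_passes: int = 10) -> str:
    """Replace <$Name> references; undefined names are left untouched."""
    for _ in range(max_passes):  # bounded, so self-referencing macros stop
        new = MACRO.sub(lambda m: macros.get(m.group(1), m.group(0)), text)
        if new == text:
            break
        text = new
    return text

def process(name: str, files: dict, macros: dict, out: list) -> None:
    """Process one file line by line, following #define and #include."""
    for line in files[name].splitlines():
        if line.startswith("#define"):
            parts = line.split(None, 2)
            macros[parts[1]] = parts[2] if len(parts) > 2 else ""
        elif line.startswith("#include"):
            # The include target may itself contain macro references.
            target = expand(line.split(None, 1)[1], macros)
            process(target, files, macros, out)
        elif line.strip():
            out.append(expand(line, macros))

def build(start: str, files: dict) -> str:
    out = []
    process(start, files, {}, out)
    return "\n".join(out)

FILES = {
    "page.it": "#define ThisPage page\n"
               "#define ThisTitle Page Title\n"
               "#include common.ih\n"
               "#include <$PageSource>.ih",
    "common.ih": "#define SiteName Example Site\n"
                 "#define HeaderSource header\n"
                 "#define FooterSource footer\n"
                 "#define PageSource page-source",
    "page-source.ih": "#include <$HeaderSource>.ih\n"
                      "#include <$ThisPage>.txt\n"
                      "#include <$FooterSource>.ih",
    "header.ih": "<head>\n<title><$SiteName> - <$ThisTitle></title>\n"
                 "</head>\n<body>",
    "page.txt": "#include <$ThisPage>.body",
    "page.body": "[body text]",
    "footer.ih": "</body>\n</html>",
}

print(build("page.it", FILES))
```

Because the include target is expanded before the file is looked up, changing the PageSource definition in common.ih is enough to switch every page to a different page-source file.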

Note: Unless you want to preserve the look of pages from an existing site while adding extra navigation features or other adornments around them, it is probably not worthwhile to have the template file (template.txt) include a page .body file (page.body). Also, unless you want each page to have individual <head> items, such as page-specific meta tags, it is not necessary to have the header file (header.ih) include a .head file (page.head).


This page last updated: 02/03/2003 at 2:05:55am
Please send any comments on this page to Dr Michael Baker.
This page made with PPWizard.