Discussion:
[Tidy-dev] White lines in Netscape 6
Frank Visser
2002-03-22 13:23:11 UTC
Permalink
Hi,

This may be a known problem, but I just stumbled on it.

I have tidied a couple of sites which display well in IE 5.x and Netscape
4.x, but in Netscape 6 there are white horizontal lines around images.

What option in Tidy should I turn off to avoid this?

I thought it was indent, as the documentation says it is better to avoid the
setting "yes" and use "auto" instead, but that's exactly what I used, and I
still get the white lines.

What exactly is the problem here? I thought it was a line break between <img
/> and </td>, but even <img /></td> does not seem to work.

Thanks for any help. In particular, can I fix the pages with Tidy to work
around this browser bug?

frank

HUMAN-i
Euro RSCG Interaction
Frank Visser
Project Manager
Snipweg 3
1118 DN Schiphol
The Netherlands
T +31 (0)20 456 53 87
F +31 (0)20 456 51 00
E ***@human-i.com
W www.human-i.com
Karl Ove Hufthammer
2002-03-22 13:28:09 UTC
Permalink
Post by Frank Visser
This may be a known problem, but I just stumbled on it.
I have tidied a couple of sites which display well in IE
5.x and Netscape 4.x, but in Netscape 6 there are white
horizontal lines around images.
Please see:
<URL: http://developer.netscape.com/evangelism/docs/articles/img-table/ >
--
Karl Ove Hufthammer
Frank Visser
2002-03-23 12:07:04 UTC
Permalink
Karl,

Thanks a lot for this tip. That's exactly what I needed.

The key phrase seems to be (for me): "If your document uses transitional
markup, make sure your DOCTYPE reflects that fact and does not have a URI".

Does that mean I can/should use the transitional doctype declaration:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/transitional.dtd">
<html xmlns="http://www.w3.org/TR/xhtml1">

... but i have to delete the
"http://www.w3.org/TR/xhtml1/DTD/transitional.dtd" part?

I tried, but the images still have white gaps under them.

Can you specify the doctype I need for me?

thanks in advance,

frank
Message: 4
Date: Fri, 22 Mar 2002 11:14:10 -0500
To: ***@interaccess.com
From: Charles Reitzel <***@rcn.com>
Subject: Re: [Tidy-dev] Tidy for html-xml parser and embedded C++.
Cc: tidy-***@lists.sourceforge.net

Hi Thaddeus,

First, if I read you right, you are asking for a library version of HTML
Tidy. A couple folks have forged ahead with Tidy libraries. See
http://www.lemburg.com/files/python/mxTidy.html and
http://www.dysfunctionals.org/~lee/TidyCPP.zip, also
http://perso.wanadoo.fr/ablavier/TidyCOM/. Any of these will lag somewhat
behind the current version. For example, I have used TidyCOM with VB
successfully to do bulk Word-To-HTML conversions.

Otherwise, what you want to do _is_ doable, just not easily in C. In a
shell script or, better, Perl it is not a problem. Simply use existing
Tidy options to send all the errors to a file, output to another file and,
perhaps, various informational messages to the standard output. The
non-informational messages (warnings and errors) are easily parsed with a
regular expression or even C strstr().
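
As a rough illustration of the regular-expression approach, here is a minimal
Python sketch (Python rather than Perl is my choice, not Charlie's). It assumes
Tidy's messages look like "line N column M - Warning: text"; that format, the
function name parse_tidy_messages and the sample text are assumptions for
illustration, so check them against the output of your own Tidy build.

```python
import re

# Assumed message shape: "line 12 column 8 - Warning: some text"
MESSAGE_RE = re.compile(r"^line (\d+) column (\d+) - (Warning|Error): (.*)$")

def parse_tidy_messages(text):
    """Return (line, column, severity, message) tuples found in Tidy's error output."""
    found = []
    for line in text.splitlines():
        match = MESSAGE_RE.match(line.strip())
        if match:
            found.append((int(match.group(1)), int(match.group(2)),
                          match.group(3), match.group(4)))
    return found

if __name__ == "__main__":
    sample = ("line 1 column 1 - Warning: missing <!DOCTYPE> declaration\n"
              "line 12 column 8 - Error: <T> is not recognized!\n"
              "1 warning, 1 error were found!")
    for entry in parse_tidy_messages(sample):
        print(entry)
```

Lines that do not match the assumed pattern (summaries, informational text) are
simply skipped.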

Also, if you are _documenting_ C code, you might try placing the C source
code within either <pre> or <code> tags.

Hope this helps and send along any follow up questions you may have.

thanks,
Charlie
Post by Frank Visser
be better off sending it to this group.
One addition to what I've written. I'm doing this on Linux.
First.
Someone suggested that I send my query to this mailing list.
I haven't been able to find any way to subscribe to this mailing list,
so please either send me the answer directly or show me how
to subscribe.
My problem.
I've written software which crawls through web pages ie given
a web page, I find all the links ( and all the images ) on that web
page. ( The purpose of this is that I get a lot of manuals books etc.
as a tar gzipped set of html documents [eg the Python documentation ].
I then install these on my local web server [ accessible only from my
Lan of which I am the only user ]. I download stuff faster than I can
add a link, so the crawler finds all the files and adds links to files
( I try to be top down--make a best guess of what the index pages are
). Then I find all the links on the links etc.
The main problem I have is parsing the page to find the links.
At first I tried using regular expressions, and it mostly worked.
1) Fragile, and there seemed to be multiple exceptions to the rules
that kept growing.
2) Slow.
So then I used expat to parse the files.
Which was fine for the xml files, but didn't work
for the html files ( of course).
The solution to this was: if expat choked on the file, then change
tidy -asxhtml -m $filename.
Unfortunately tidy chokes on some of the files.
Very few and it looks worthwhile to go on a case by case basis.
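
The fallback described above (try expat first, run Tidy only when expat
chokes) might be sketched in Python like this. The tidy flags -asxhtml and -m
are the ones quoted in the message; the function names are hypothetical, and
the sketch assumes a tidy executable is on the PATH.

```python
import subprocess
import xml.parsers.expat

def is_well_formed(data):
    """Return True if expat can parse the data (str or bytes) without error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True

def ensure_xhtml(filename):
    """Run 'tidy -asxhtml -m' on the file only if expat rejects it.

    Returns True when Tidy was invoked. Tidy exits non-zero when it reports
    warnings or errors, so the exit status is not treated as a failure here.
    """
    with open(filename, "rb") as handle:
        if is_well_formed(handle.read()):
            return False
    subprocess.run(["tidy", "-asxhtml", "-m", filename], check=False)
    return True
```

A crawler would call ensure_xhtml() on each file before handing it to the XML
parser.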
The biggest offenders seem to be web pages that contain embedded
C++. For example: vector<T>. Tidy interprets this as a tag <T>.
1) Instead of calling tidy via a system call,
I would like to take the tidy source, remove main and write a
char *tidy(char *buffer,char *error);
Where buffer is the to be parsed file, error is a buffer containing
error messages and tidy returns a xhtml version of the buffer.
2) If this tidy function encounters an error, I would like some way of
being told what character in the buffer the error first occurs at, e.g.:
int char_pos;
memcpy(tidy_buffer,original_buffer, sizeof(file));
tidy(tidy_buffer);
while((char_pos=error_is_bad_tag())!=0)
{
    fix_tag_at_pos(&original_buffer, char_pos);
    memcpy(tidy_buffer,original_buffer, sizeof(file));
    tidy(tidy_buffer);
}
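
One way to sidestep the vector<T> problem without patching Tidy is to escape
anything that looks like a tag but is not a known HTML element before Tidy sees
it. The Python sketch below is only an illustration of that idea: KNOWN_TAGS is
a deliberately tiny, hypothetical list, and a real crawler would need the full
set of HTML element names.

```python
import re

# Hypothetical, incomplete list of element names; extend it for real use.
KNOWN_TAGS = {"a", "b", "body", "code", "head", "html", "i", "img",
              "p", "pre", "table", "td", "title", "tr"}

# Matches <name ...> and </name ...> without nested angle brackets.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^<>]*)>")

def escape_unknown_tags(text, known=KNOWN_TAGS):
    """Replace the angle brackets of unrecognized 'tags' with &lt; and &gt;."""
    def fix(match):
        if match.group(2).lower() in known:
            return match.group(0)          # real HTML element: keep as-is
        return "&lt;" + match.group(0)[1:-1] + "&gt;"
    return TAG_RE.sub(fix, text)

if __name__ == "__main__":
    print(escape_unknown_tags("vector<T> v; <b>bold</b>"))
```

This is a heuristic: a C++ template parameter that happens to share a name with
an HTML element (for example <A>) would still slip through.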
_______________________________________________
Tidy-develop mailing list
https://lists.sourceforge.net/lists/listinfo/tidy-develop
--__--__--

Message: 5
Date: Fri, 22 Mar 2002 17:33:23 +0100
To: ***@interaccess.com
From: Lee Goddard <***@LeeGoddard.com>
Subject: Re: [Tidy-dev] Tidy for html-xml parser and embedded C++.
Post by Frank Visser
The main problem I have is parsing the page to find the links.
Your best bet is to use Perl; as it was designed for this, there are modules
for exactly this job. For example:

NAME
HTML::LinkExtor - Extract links from an HTML document

SYNOPSIS
require HTML::LinkExtor;
$p = HTML::LinkExtor->new(\&cb, "http://www.perl.org/");
sub cb {
    my($tag, %links) = @_;
    print "$tag @{[%links]}\n";
}
$p->parse_file("index.html");

DESCRIPTION
*HTML::LinkExtor* is an HTML parser that extracts links from an HTML
document. The *HTML::LinkExtor* is a subclass of *HTML::Parser*. This
means that the document should be given to the parser by calling the
$p->parse() or $p->parse_file() methods.

$p = HTML::LinkExtor->new([$callback[, $base]])
The constructor takes two optional arguments. The first is a
reference to a callback routine. It will be called as links are
found. If a callback is not provided, then links are just
accumulated internally and can be retrieved by calling the
$p->links() method.

The $base argument is an optional base URL used to absolutize all
URLs found. You need to have the *URI* module installed if you
provide $base.

The callback is called with the lowercase tag name as first
argument, and then all link attributes as separate key/value pairs.
All non-link attributes are removed.

$p->links
Returns a list of all links found in the document. The returned
values will be anonymous arrays with the following elements:

[$tag, $attr => $url1, $attr2 => $url2,...]

The $p->links method will also truncate the internal link list. This
means that if the method is called twice without any parsing between
them the second call will return an empty list.

Also note that $p->links will always be empty if a callback routine
was provided when the *HTML::LinkExtor* was created.

EXAMPLE
This is an example showing how you can extract links from a document
received using LWP:

use LWP::UserAgent;
use HTML::LinkExtor;
use URI::URL;

$url = "http://www.perl.org/"; # for instance
$ua = LWP::UserAgent->new;

# Set up a callback that collects image links
my @imgs = ();
sub callback {
    my($tag, %attr) = @_;
    return if $tag ne 'img'; # we only look closer at <img ...>
    push(@imgs, values %attr);
}

# Make the parser. Unfortunately, we don't know the base yet
# (it might be different from $url)
$p = HTML::LinkExtor->new(\&callback);

# Request document and parse it as it arrives
$res = $ua->request(HTTP::Request->new(GET => $url),
                    sub {$p->parse($_[0])});

# Expand all image URLs to absolute ones
my $base = $res->base;
@imgs = map { $_ = url($_, $base)->abs; } @imgs;

# Print them out
print join("\n", @imgs), "\n";

SEE ALSO
the HTML::Parser manpage, the HTML::Tagset manpage, the LWP manpage, the
URI::URL manpage

COPYRIGHT
Copyright 1996-2001 Gisle Aas.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
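
For readers not using Perl, roughly the same link extraction can be sketched
with Python's standard html.parser module. This is an analogue, not a port of
HTML::LinkExtor: the class name LinkCollector, the function extract_links and
the choice to look only at href and src attributes are assumptions made for
this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Only these attributes are treated as links in this sketch.
LINK_ATTRS = {"href", "src"}

class LinkCollector(HTMLParser):
    """Collect (tag, attribute, url) triples while parsing HTML."""

    def __init__(self, base=None):
        super().__init__()
        self.base = base
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in LINK_ATTRS and value:
                # Make the URL absolute when a base URL was supplied.
                url = urljoin(self.base, value) if self.base else value
                self.links.append((tag, name, url))

def extract_links(html, base=None):
    collector = LinkCollector(base)
    collector.feed(html)
    collector.close()
    return collector.links

if __name__ == "__main__":
    page = '<a href="doc.html">doc</a><img src="/logo.png">'
    for link in extract_links(page, "http://example.org/dir/"):
        print(link)
```

Like the Perl module, this parser is tolerant of sloppy HTML, so it does not
need the page to be well-formed XML first.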



hth
lee




--__--__--


End of Tidy-develop Digest
Karl Ove Hufthammer
2002-03-23 12:23:01 UTC
Permalink
Post by Frank Visser
Thanks a lot for this tip. That's exactly what i needed.
The key phrase seems to be (for me): "If your document uses
transitional markup, make sure your DOCTYPE reflects that
fact and does not have a URI".
Does that mean I can/should use the transitional doctype
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0
[...]

No. XHTML documents are *always* rendered in standards mode
(i.e. correctly). If you want to use quirks (bugwards
compatible) mode, you have to use HTML 4.01 (or earlier), *not*
XHTML.
Post by Frank Visser
I tried, but the images still have white gaps under them.
Can you specify the doctype I need for me?
Use
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

and ensure your document is normal HTML 4.01, not XHTML. But
IMO a better solution is to specify style rules so that your
document is displayed the way you want, even if 'standards
mode' rendering is used.
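
As an illustration of the "style rules" route: a commonly suggested rule for
the image gap is to make images inside table cells block-level. The Python
sketch below inserts such a rule before </head>; the rule text and the helper
name add_image_gap_fix are assumptions for this example, and whether the rule
suits a given layout has to be checked in the browser.

```python
import re

# Assumed fix: render images in table cells as blocks so that no space is
# reserved below them for text descenders.
GAP_FIX = ('<style type="text/css">\n'
           'td img { display: block; }\n'
           '</style>\n')

def add_image_gap_fix(html):
    """Insert the style rule just before </head>; leave the page alone if
    there is no head element."""
    match = re.search(r"</head\s*>", html, re.IGNORECASE)
    if match is None:
        return html
    return html[:match.start()] + GAP_FIX + html[match.start():]

if __name__ == "__main__":
    print(add_image_gap_fix("<html><head><title>t</title></head><body></body></html>"))
```

In practice the rule would normally live in the site's shared style sheet
rather than being injected page by page.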
--
Karl Ove Hufthammer
Frank Visser
2002-03-23 17:48:01 UTC
Permalink
Hi Karl,

Thanks again for clearing up the confusion. Forgive me for continuing this
discussion, but i have spent today going over the NS6 issue with
tables/images/spacers, and now it dawns on me that this issue has not begun
with XHTML 1.0 at all, but with HTML 4.0 - I wasn't aware of this when i
started off this thread.

For HTML 4.0 the advice is given:
- either declare use of a transitional DTD, but do not include a URL (Meyer
on Netscape's own DevEdge),
- or don't use a DOCTYPE at all (Eric Meijer on O'Reilly site)

It all boils down to a Netscape 6 "bug" (it even has a number: #42525, see
http://bugzilla.mozilla.org/) or a "purist interpretation of CSS2"
(according to jaylard at evolt.org), but the upshot is: i cannot use XHTML
(transitional) code simply because Netscape 6 refuses to recognize it as
such (i.e. mixed XHTML and presentational tags), and insists on rendering it
as strict code!?

Where is the logic behind all this? It simply does not make sense to me.
Isn't the essence of a "transitional" page that it does NOT conform strictly
to the rules of future coding?

Now for XHTML you seem to say, leaving out the URL does not help, for XHTML
is always renderend standard (i.e. strict), instead of quirky (i.e. loose).

Instead I should just stick to HTML 4.0 coding, and leave it at that?

What about leaving out the DOCTYPE declaration at all, for the moment (as
Meyer seems to advise)? That seems to fix my problems, and I can still have
clean XHTML on my pages. Does that do any harm, even as it may serve no
purpose either?

The reason I took the trouble to upgrade my pages to XHTML is that it
teaches my programmers to code more rigorously, and the content of the pages
can be processed in XML based publishing systems. I assume I can still
continue down that road?

Best,

frank

Karl Ove Hufthammer
2002-03-23 19:55:01 UTC
Permalink
Post by Frank Visser
Thanks again for clearing up the confusion. Forgive me for
continuing this discussion, but i have spent today going
over the NS6 issue with tables/images/spacers, and now it
dawns on me that this issue has not begun with XHTML 1.0 at
all, but with HTML 4.0
That's correct.
Post by Frank Visser
- either declare use of a transitional DTD, but do not
include a URL (Meyer on Netscape's own DevEdge),
Yes.
Post by Frank Visser
- or don't use a DOCTYPE at all (Eric Meijer on O'Reilly
site)
In this case, your document would not be HTML at all, but
so-called 'tag soup'.
Post by Frank Visser
It all boils down to a Netscape 6 "bug" (it even has a
number: #42525, see http://bugzilla.mozilla.org/
No, this bug was fixed almost a year and a half ago, and the
bug report is closed.

Post by Frank Visser
) or a "purist interpretation of CSS2" (according to jaylard at
evolt.org), but the upshot is: i cannot use XHTML
(transitional)
Yes, you can, *if* you write it in a way that makes it display the
way you want in Netscape. This is not impossible!
Post by Frank Visser
code simply because Netscape 6 refuses to
recognize it as such
No, Netscape correctly parses and presents it as a (X)HTML
Transitional document.
Post by Frank Visser
(i.e. mixed XHTML and presentational
tags), and insists on rendering it as strict code!?
No, it renders it as a Transitional document, but it renders it
in 'standards mode', i.e. correctly, as per the CSS specs. The
names 'Strict' and 'Transitional' are only names given to
various 'versions' of HTML 4.01 and XHTML 1.0.
Post by Frank Visser
Where is the logic behind all this?
The logic is this:

There are billions of pages with broken CSS (and HTML) out
there. These were rendered incorrectly by most older browsers,
but they were rendered *the way the Web designers wanted* (the
designers naturally assumed the browsers rendered their
document correctly). If newer browsers suddenly started to
render documents in standards-compliant mode, all these
billions of pages would break.

One solution would be to *never* support the CSS 2 standard.
Obviously a bad solution! Another solution is to only render
documents in standards-compliant mode if they are using a full
DOCTYPE declaration. The assumption is that authors who write
pages with such a DOCTYPE know the various Web standards, wish
to follow them, and want their documents to be rendered
correctly (a very reasonable assumption, I might add).
Post by Frank Visser
It simply does not make
sense to me. Isn't the essence of a "transitional" page
that it does NOT conform strictly to the rules of future
coding?
No.
Post by Frank Visser
Now for XHTML you seem to say, leaving out the URL does not
help, for XHTML is always renderend standard (i.e. strict),
instead of quirky (i.e. loose).
Yes. If you write XHTML documents, you obviously want to follow
the standards, and then there's no need for buggy CSS
rendering.
Post by Frank Visser
Instead I should just stick to HTML 4.0 coding, and leave
it at that?
Yes. Or change your style sheet so that the document is
displayed the way you want *even* in standards-compliant mode
(luckily, it will probably still be rendered the way you want
in older browsers).
Post by Frank Visser
What about leaving out the DOCTYPE declaration at all, for
the moment (as Meyer seems to advise)? That seems to fix my
problems, and I can still have clean XHTML on my pages.
No, a DOCTYPE declaration is *required* for all XHTML (and HTML
4.x) documents. I advise you to run your documents through
<URL: http://validator.w3.org/ >.

Also note that other browsers, such as IE 6, use the DOCTYPE
to choose whether standards-compliant or 'compatible' rendering
should be used. You can find more information on 'DOCTYPE
switching' at <URL:
http://gutfeldt.ch/matthias/articles/doctypeswitch.html >.
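
To summarize the switching rules as they are stated in this thread, here is a
deliberately simplified Python model. It is not how any browser implements the
switch (real browsers consult much longer lists of public identifiers); the
function name rendering_mode and the three rules it encodes are only a
restatement of the discussion above.

```python
import re

# Captures the public identifier and, if present, the system identifier (URI).
DOCTYPE_RE = re.compile(r'PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?', re.IGNORECASE)

def rendering_mode(doctype):
    """Simplified model of the rules discussed in this thread:
    - no DOCTYPE at all                      -> quirks
    - any XHTML DOCTYPE                      -> standards
    - Transitional DOCTYPE without a URI     -> quirks
    - everything else (a "full" DOCTYPE)     -> standards
    """
    if not doctype or not doctype.strip():
        return "quirks"
    match = DOCTYPE_RE.search(doctype)
    if match is None:
        return "standards"
    public_id, system_id = match.group(1), match.group(2)
    if "XHTML" in public_id:
        return "standards"
    if "Transitional" in public_id and not system_id:
        return "quirks"
    return "standards"

if __name__ == "__main__":
    print(rendering_mode('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'))
```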
--
Karl Ove Hufthammer
Charles Reitzel
2002-03-24 15:48:09 UTC
Permalink
There is another, much simpler explanation: the CSS standards are
broken. Perhaps CSS should be adjusted to comply with existing
implementations - especially insofar as they are interoperable.

<flame suit on>

Don't get me wrong. Stylesheets are great. I have been a major promoter
of both CSS and XSLT stylesheets. I don't believe in standards getting too
far ahead of actual implementations, that's all. For example, there are no
full CSS2 implementations yet after 3 years. Does anyone have a good feel
for what might be an inter-operable subset of CSS2? Are there folks
successfully using CSS2 to re-purpose content to multiple media types:
desktop, print, PDA? IMHO, CSS1 has only just arrived.

take it easy,
Charlie
Post by Karl Ove Hufthammer
Post by Frank Visser
Where is the logic behind all this?
There are billions of pages with broken CSS (and HTML) out
there. These were rendered incorrectly by most older browsers,
but they were rendered *the way the Web designers wanted* (the
designers naturally assumed the browsers rendered their
document correctly). If newer browsers suddenly started to
render documents in standards-compliant mode, all these
billions of pages would break.
One solution would be to *never* support the CSS 2 standard.
Obviously a bad solution! Another solution is to only render
documents in standard-compliant mode if they are using a full
DOCTYPE declaration. The assumption is that the authors
who write pages with such a DOCTYPE, knows the various Web
standards, wishes to follow them, and wants his/her documents
to be rendered correctly (a very reasonable assumption, I might
add).
Dave Raggett
2002-03-25 10:03:03 UTC
Permalink
Post by Charles Reitzel
There is another, much simpler explanation: the CSS standards are
broken. Perhaps CSS should be adjusted to comply with existing
implementations - especially insofar as they are interoperable.
<flame suit on>
Don't get me wrong. Stylesheets are great. I have been a major promoter
of both CSS and XSLT stylesheets. I don't believe in standards getting too
far ahead of actual implementations, that's all. For example, there are no
full CSS2 implementations yet after 3 years. Does anyone have a good feel
for what might be an inter-operable subset of CSS2? Are there folks
successfully using CSS2 to re-purpose content to multiple media types:
desktop, print, PDA? IMHO, CSS1 has only just arrived.
FYI - W3C is considering an update to the CSS2 spec to apply the current
requirements for multiple implementations of W3C specs. The W3C Process
has evolved to add a Candidate Recommendation (CR) phase since CSS2 was
done. The update to CSS2 will be followed by a modularized version of
CSS, named CSS3, which of course will have to go through the CR process.
For more information contact the W3C style activity lead, Bert Bos
<***@w3.org>
--
Dave Raggett <***@openwave.com> or <***@w3.org>
W3C Visiting Fellow, see http://www.w3.org/People/Raggett
tel/fax: +44 1225 866240 (or 867351) +44 771 213 7629 (GSM)
Frank Visser
2002-03-25 07:04:06 UTC
Permalink
karl,

but shouldn't people who use Tidy be warned explicitly that (1) you cannot
just tidy your code with it but (2) you should also add stylesheet fixes for
images if you don't want to run into trouble with Netscape 6 (and IE 6)?

So one has to go all the way, and can't just stop somewhere in between?

frank




Karl Ove Hufthammer
2002-03-25 08:03:05 UTC
Permalink
Post by Frank Visser
but shouldn't people who use Tidy be warned explicitly that
(1) you cannot just tidy your code with it but (2) you
should also add stylesheet fixes for images if you don't
want to run into trouble with Netscape 6 (and IE 6)?
No, we should assume that the user uses correct style sheets.
And 'standards mode' in the various browsers differs from
'quirks mode' in many other ways than just space around images.
For example, IE 6 only uses the CSS 1 box model (supported in
Opera from version 3.6!) if you use a 'standards mode'
DOCTYPE. Opera uses this box model for *every* DOCTYPE (and
missing DOCTYPE).
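
The practical effect of the box-model difference can be shown with a little
arithmetic. In the CSS 1 box model the width property covers the content only,
so padding and border are added on top; in the older IE 5 model that IE 6
keeps for quirks mode, width already includes padding and border. The Python
sketch below just encodes that arithmetic; the function name and the example
numbers are made up for illustration.

```python
def rendered_width(width, padding, border, mode):
    """Total horizontal space taken by a box, in pixels (margins ignored).

    'standards': CSS 1 box model, width is the content width only.
    'quirks':    older IE model, width already includes padding and border.
    """
    if mode == "standards":
        return width + 2 * padding + 2 * border
    if mode == "quirks":
        return width
    raise ValueError("mode must be 'standards' or 'quirks'")

if __name__ == "__main__":
    # width: 200px; padding: 10px; border: 5px
    print(rendered_width(200, 10, 5, "standards"))
    print(rendered_width(200, 10, 5, "quirks"))
```

The same style sheet therefore produces a box 30 pixels wider in standards
mode, which is the kind of difference a changed DOCTYPE can introduce.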

Other examples (from Netscape 6.x):

* Miscellaneous & Style
  o All of the style rules in quirk.css apply.
  o Stylesheets linked in the document with an advisory mime
    type of text/css will still be treated as CSS even if the
    server gives a Content-Type header other than text/css.
  o The CSS parser accepts colors not beginning with #.
  o The CSS parser interprets unitless numbers as px (except
    for font-size because that was what Nav4 did, and except
    for line-height and any other properties where they have
    distinct meaning).
  o HTML colors are parsed differently (# is not required, and
    missing digits are filled in differently)
  o An empty string for the background attribute sets the
    background URL to empty only in quirks mode.
  o System fonts work differently in navquirks mode (shouldn't
    the form controls that use them be the ones working
    differently instead?).
  o HTML (1-7) and CSS (xx-small - xx-large) font sizes are
    calculated slightly differently (see bug 18136).
  o For bug 18817, quirks mode was intended to accept
    stylesheets in certain cases when strict mode did not, but
    the code has since gotten somewhat mangled and doesn't seem
    to have much point anymore.
* Block and Inline layout
  o Line height (not line-height) calculations are different
    to fix bug 5821 and bug 24186 (some other issues are
    described in bug 22274).
  o There are a bunch of quirks to get percentage heights on
    images and tables to "work" (the way they did in Nav4),
    some of which may cause other effects (see bug 54119).
  o The HR element is treated differently in quirks and strict
    mode (and arguably wrong in both).
* Tables
  o Table background colors work differently (see bug 4510).
    It is not clear that this quirk is needed.
  o In quirks mode absmiddle (handled incorrectly?) and middle
    (perhaps incorrectly as well?) are accepted as values of
    align on table cells, and absmiddle, abscenter, and middle
    are supported on tables (treated the same as center).
  o TD, TH, TR, THEAD, TBODY, and TFOOT elements have the
    document background (and color?) applied to them (when the
    document background is specified in certain ways?) (see
    also bug 70831).
  o The empty-cells property defaults to hide in quirks mode
    but show (according to CSS2 errata) in strict mode (see
    bug 33244) (though the correct fix would be to specify it
    on the HTML TABLE element in quirk.css).
  o In quirks mode floated tables never move to the next
    "line" if they don't fit next to other floats, they just
    keep widening the page (see bug 43086).
  o In quirks mode colspan="0" and rowspan="0" are
    intentionally not handled as described in HTML4.
  o hspace and vspace are supported on TABLE only in quirks
    mode.
  o In quirks mode, when tables have a border style of inset
    or outset, the border color is based on the background
    color of the table or of the nearest ancestor with
    non-transparent background.
  o In quirks mode table cells with a border have a minimum
    width of one pixel.
  o The basic table layout strategy ignores padding (on what)
    in quirks mode.
  o The basic table layout strategy handles widths differently
    in some way.
* Forms
  o Button inputs calculate their size differently.
  o In standard mode a BUTTON element (?) can submit only if
    it lacks a type attribute.
  o Text inputs (and other form controls containing text???)
    calculate their size differently (see bug X for the
    description of the original fix and also bug X for
    suggested modifications)
  o The fonts for button INPUT elements and SELECT elements
    are computed differently.
  o The requirement of HTML that one button in a radio group
    is always selected (by default) is not enforced in quirks
    mode.
* Frames
  o In quirks mode marginwidth and marginheight on a FRAME are
    propagated to the contained BODY.
  o In a frame size specification 0* is treated as 1* (see bug
    40383).
  o The scrolling attribute on FRAME is handled differently.
--
Karl Ove Hufthammer
Charles Reitzel
2002-03-25 14:03:04 UTC
Permalink
There are many implications for Tidy in what you say.

First, our browser support probably needs to expand to include NS 6. Some
folks have done occasional testing on it, but it isn't a primary
target. But, as you point out, NS 6 has different modes of
rendering. Thus, NS 6 is really two browsers, depending on the flavor of
markup fed to it.

Further, it appears IE 6 also varies in its behavior in similar ways -
perhaps for similar reasons.

It seems like our testing process / framework should address this dichotomy
in some way. Because the differences are visual, it is difficult at best
to automate. Thus, all the more reason, to organize the tests well. Ideas?

Question: can we safely drop support for NS 4.x in favor of NS 6.x? The
same for IE? This would follow the recommendations of the Web Standards
Project's browser upgrade initiative. See
http://archive.webstandards.org/upgrade/.

Still further, I agree that, at present, Tidy has no choice but to assume
that the CSS styles are correct. As we all increasingly depend on CSS, a
CSS Tidy might be really useful. Such a thing might be based on the W3C
SAC C binding. Probably a "back burner" item, but something I have wished I
had often enough.

take it easy,
Charlie
Post by Karl Ove Hufthammer
Post by Frank Visser
but shouldn't people who use Tidy be warned explicitly that
(1) you cannot just tidy your code with it but (2) you
should also add stylesheet fixes for images if you don't
want to run into trouble with Netscape 6 (and IE 6)?
No, we should assume that the user uses correct style sheets.
And 'Standards mode' in the various browsers has many other
difference from 'quirks mode' than just space around images.
For example, IE 6 only uses the CSS 1 box model (supported in
Opera from version 3.6!) if you use a 'standard mode'
DOCTYPE. Opera uses this box model for *every* DOCTYPE (and
missing DOCTYPE).
* Miscellaneous & Style
o All of the style rules in quirk.css apply.
o Stylesheets linked in the document with an
advisory mime type of text/css will still be
treated as CSS even if the server gives a
Content-Type header other than text/css.
o The CSS parser accepts colors not beginning with #.
o The CSS parser interprets unitless numbers as
px (except for font-size because that was what Nav4
did, and except for line-height and any other
properties where they have distinct meaning).
o HTML colors are parsed differently (# is not
required, and missing digits are filled in
differently).
o An empty string for the background attribute sets
the background URL to empty only in quirks mode.
o System fonts work differently in navquirks mode
(shouldn't the form controls that use them be the
ones working differently instead?).
o HTML (1-7) and CSS (xx-small - xx-large) font
sizes are calculated slightly differently (see bug
18136).
o For bug 18817, quirks mode was intended to accept
stylesheets in certain cases when strict mode did
not, but the code has since gotten somewhat mangled
and doesn't seem to have much point anymore.
* Block and Inline layout
o Line height (not line-height) calculations are
different to fix bug 5821 and bug 24186 (some other
issues are described in bug 22274).
o There are a bunch of quirks to get percentage
heights on images and tables to "work" (the way they
did in Nav4), some of which may cause other effects
(see bug 54119).
o The HR element is treated differently in quirks and
strict mode (and arguably wrong in both).
* Tables
o Table background colors work differently (see bug
4510). It is not clear that this quirk is needed.
o In quirks mode absmiddle (handled incorrectly?) and
middle (perhaps incorrectly as well?) are accepted as
values of align on table cells, and absmiddle,
abscenter, and middle are supported on tables
(treated the same as center).
o TD, TH, TR, THEAD, TBODY, and TFOOT elements have
the document background (and color?) applied to
them (when the document background is specified in
certain ways?) (see also bug 70831).
o The empty-cells property defaults to hide in quirks
mode but show (according to CSS2 errata) in strict
mode (see bug 33244) (though the correct fix would
be to specify it on the HTML TABLE element in
quirk.css).
o In quirks mode floated tables never move to the
next "line" if they don't fit next to other floats,
they just keep widening the page (see bug 43086).
o In quirks mode colspan="0" and rowspan="0" are
intentionally not handled as described in HTML4.
o hspace and vspace are supported on TABLE only in
quirks mode.
o In quirks mode, when tables have a border style of
inset or outset, the border color is based on the
background color of the table or of the nearest
ancestor with non-transparent background.
o In quirks mode table cells with a border have a
minimum width of one pixel.
o The basic table layout strategy ignores padding (on
what) in quirks mode.
o The basic table layout strategy handles widths
differently in some way.
* Forms
o Button inputs calculate their size differently.
o In standard mode a BUTTON element (?) can submit
only if it lacks a type attribute.
o Text inputs (and other form controls containing
text???) calculate their size differently (see bug X
for the description of the original fix and also bug
X for suggested modifications).
o The fonts for button INPUT elements and SELECT
elements are computed differently.
o The requirement of HTML that one button in a radio
group is always selected (by default) is not enforced
in quirks mode.
* Frames
o In quirks mode marginwidth and marginheight on a
FRAME are propagated to the contained BODY.
o In a frame size specification 0* is treated as 1*
(see bug 40383).
o The scrolling attribute on FRAME is handled
differently.
--
Karl Ove Hufthammer
_______________________________________________
Tidy-develop mailing list
https://lists.sourceforge.net/lists/listinfo/tidy-develop
Karl Ove Hufthammer
2002-03-25 14:57:10 UTC
Permalink
Post by Charles Reitzel
There are many implications for Tidy in what you say.
First, our browser support probably needs to expand to
include NS 6.
Yes.
Post by Charles Reitzel
Some folks have done occasional testing on
it, but it isn't a primary target. But, as you point out,
NS 6 has different modes of rendering. Thus, NS 6 is
really two browsers, depending on the flavor of markup fed
to it.
And so is IE 6 for Windows and IE 5 for Mac.
Post by Charles Reitzel
Further, it appears IE 6 also varies in its behavior in
similar ways - perhaps for similar reasons.
Yes. But note that *if* you write your CSS to work with
Navigator 6.x / Mozilla, Opera and IE 6, they will *probably*
still work correctly in IE 5, so this isn't *that* much of a
problem! Here's an example:

In the CSS box model, the 'width' property specifies the width
of the *content* box, and margins, padding and borders are
*added* to this width. But in IE 5 (and earlier versions), the
'width' was incorrectly seen as specifying the width of the
*whole* box, *including* margins, padding and border. So the
following would create two columns, side by side, in IE 5:

<div id="column1">Content ...</div>
<div id="column2">Content ...</div>

div#column1
{
  width: 50%;
  padding-right: 1em;
  border-right: 1px solid black;
  margin-right: 1em;
  float: left;
}

div#column2
{
  width: 50%;
  margin-left: 1em;
  float: right;
}

But according to CSS (1 and 2), the total width of the two
'div's would take up 50% + 50% + 3em + 1px > 100% of the
available width, so the second column would drop *below* the
first column. To fix this, you can create nested boxes, where
you specify the width on the outer boxes, and margins, borders
and padding on the inside boxes:

<div id="outside1">
  <div id="inside1">Content ...</div>
</div>

<div id="outside2">
  <div id="inside2">Content ...</div>
</div>

div#outside1
{
  width: 50%;
  float: left;
}

div#outside2
{
  width: 50%;
  float: right;
}

div#inside1
{
  padding-right: 1em;
  border-right: 1px solid black;
  margin-right: 1em;
}

div#inside2
{
  margin-left: 1em;
}

This will work in IE 6, Mozilla and Opera, but it will *still*
work in IE 5, even with its broken box model (since you never
specify the 'width' and the 'border'/'margin'/'padding' on the
same element).
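
To make the arithmetic concrete, here is a rough Python sketch of the
CSS box-model sums for the two layouts. The 800px container and the
1em = 16px conversion are assumptions picked only for illustration, and
outer_width() is a made-up helper, not part of any browser or of Tidy.

```python
# Sketch of the CSS (content-box) width arithmetic for the two layouts.
# Assumed values, chosen for illustration only: an 800px-wide container
# and 1em = 16px.

CONTAINER_PX = 800
EM_PX = 16


def outer_width(width_pct, padding_px=0, border_px=0, margin_px=0):
    """Horizontal space a box occupies under the CSS box model:
    'width' sets the content box; padding, border and margin are added."""
    content = CONTAINER_PX * width_pct / 100
    return content + padding_px + border_px + margin_px


# Flat layout: 'width' and padding/border/margin on the same element.
column1 = outer_width(50, padding_px=EM_PX, border_px=1, margin_px=EM_PX)
column2 = outer_width(50, margin_px=EM_PX)
flat_total = column1 + column2

# Nested layout: the outer boxes carry only 'width'; the inner boxes'
# padding/border/margin are spent inside each outer 50%.
nested_total = outer_width(50) + outer_width(50)

print(flat_total, flat_total > CONTAINER_PX)       # 849.0 True
print(nested_total, nested_total <= CONTAINER_PX)  # 800.0 True
```

With these assumed numbers the flat layout needs 849px of an 800px
container, so the second float drops, while the nested layout needs
exactly 800px.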
Post by Charles Reitzel
It seems like our testing process / framework should
address this dichotomy in some way. Because the
differences are visual, it is difficult at best to
automate.
Yes. And you can never *know* what the author intended to do
with the style sheet.
Post by Charles Reitzel
Question: can we safely drop support for NS 4.x in favor of
NS 6.x?
As long as the pages work correctly in non-CSS browsers
(e.g. Lynx), you can just make sure NS 4.x doesn't get served
any CSS, and it will render it as normal HTML.
Post by Charles Reitzel
Still further, I agree that, at present, Tidy has no choice
but to assume that the CSS styles are correct. As we all
increasingly depend on CSS, a CSS Tidy might be really
useful.
I agree. It could catch some common mistakes such as omitting
units (e.g. '50' --> '50px'), the # on colours ('color: 123456'
--> 'color: #123456'), illegal color names ('burlywood' -->
'#DEB887'), and could reformat the CSS (indentation). (I have,
BTW, several times used the CSS validator for this latter
purpose. It works great.)
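
As a very rough idea of what such fixes could look like, here is a
Python sketch that repairs a single 'property: value' declaration. The
function name, the property lists and the colour table are all
hypothetical and deliberately tiny; nothing like this exists in Tidy,
and a real tool would need a proper CSS parser.

```python
import re

# Hypothetical helper, not part of Tidy: repairs three common slips in a
# single, well-formed CSS declaration such as "width: 50".

# Properties where a bare number is most likely a forgotten length unit
# (an illustrative subset chosen for this sketch).
LENGTH_PROPERTIES = {"width", "height", "margin-left", "margin-right",
                     "padding-left", "padding-right"}
COLOR_PROPERTIES = {"color", "background-color"}
# Colour names to replace with hex values (one sample entry only).
NONSTANDARD_COLORS = {"burlywood": "#DEB887"}


def fix_declaration(declaration):
    """Return a repaired 'property: value' string."""
    prop, _, value = declaration.partition(":")
    prop, value = prop.strip().lower(), value.strip()

    if prop in LENGTH_PROPERTIES and re.fullmatch(r"[1-9]\d*", value):
        value += "px"                  # '50' -> '50px'; a bare '0' is left alone
    elif prop in COLOR_PROPERTIES:
        if re.fullmatch(r"[0-9a-fA-F]{6}", value):
            value = "#" + value        # '123456' -> '#123456'
        else:
            value = NONSTANDARD_COLORS.get(value.lower(), value)
    return f"{prop}: {value}"


print(fix_declaration("width: 50"))         # width: 50px
print(fix_declaration("color: 123456"))     # color: #123456
print(fix_declaration("color: burlywood"))  # color: #DEB887
print(fix_declaration("line-height: 2"))    # line-height: 2
```

Note that 'line-height: 2' is left untouched, since a unitless number
has a distinct meaning there.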
--
Karl Ove Hufthammer
Charles Reitzel
2002-03-25 16:42:27 UTC
Permalink
Thanks for the stylesheet tips. I'll put them to good use. Also, thanks
for clarifying the "box" issue for me. I'll be sure to check out the CSS
validator.

Anyway, any ideas about how to organize visual tests for Tidy would be most
welcome. I'm thinking we need tests where, in addition to the expected
output markup, the desired visual effect is known and documented. Thus,
after the markup is verified correct (text compare), testers can verify
visual behavior on any given browser.
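
To make that a bit more concrete, here is a rough Python sketch of what
one such test record might hold. Everything in it (the class, the field
names, the sample markup and browser labels) is hypothetical, just a way
to show the two steps: an automated text compare, then a manual visual
check per browser.

```python
from dataclasses import dataclass, field

# Hypothetical structure, not an existing Tidy test format.


@dataclass
class VisualTestCase:
    name: str
    expected_markup: str   # what Tidy should emit
    visual_effect: str     # documented rendering a tester confirms by eye
    browser_results: dict = field(default_factory=dict)

    def markup_matches(self, actual_markup):
        """Automated step: plain text compare of Tidy's output."""
        return actual_markup.strip() == self.expected_markup.strip()

    def record_visual(self, browser, passed):
        """Manual step: note whether the browser shows the documented effect."""
        self.browser_results[browser] = passed

    def failing_browsers(self):
        """Browsers where a tester reported the effect as missing."""
        return sorted(b for b, ok in self.browser_results.items() if not ok)


case = VisualTestCase(
    name="img-in-table-cell",
    expected_markup='<td><img src="a.gif" alt="" /></td>',
    visual_effect="No white gap below the image",
)
markup_ok = case.markup_matches('<td><img src="a.gif" alt="" /></td>\n')
case.record_visual("NS 6 (standards mode)", False)
case.record_visual("IE 5", True)
print(markup_ok, case.failing_browsers())
```

The point is only that the markup check and the visual check are
recorded separately, so a test can pass the first and still fail the
second on a given browser.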

take it easy,
Charlie