Package org.w3c.tidy
Class Configuration
java.lang.Object
org.w3c.tidy.Configuration
- All Implemented Interfaces:
Serializable
Reads a configuration file and manages configuration properties. A configuration file associates each property name
with a value, using the format of a Java .properties file.
- Version:
- $Revision: 817 $ ($Author: steffenyount $)
- Author:
- Dave Raggett dsr@w3.org, Andy Quick ac.quick@sympatico.ca (translation to Java), Fabrizio Giustina
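Since the configuration format is that of a Java .properties file, the standard java.util.Properties class can read it. The sketch below is illustrative only: the option names (indent, wrap, tidy-mark) and the class name TidyPropertiesSketch are assumptions, not taken from this page. It shows the three name/value separators the .properties format accepts.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class TidyPropertiesSketch {

    // Hypothetical configuration text; the option names are assumptions.
    static final String SAMPLE =
        "# one property per line: name, separator, value\n"
        + "indent: yes\n"       // colon separator
        + "wrap = 72\n"         // equals separator, surrounding spaces ignored
        + "tidy-mark no\n";     // whitespace separator

    /** Parses .properties-formatted text into a Properties object. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // A StringReader should not fail, but load() declares IOException.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE);
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + " -> " + props.getProperty(name));
        }
    }
}
```

A Properties object built this way is the kind of argument addProps expects, and a file with the same content is what parseFile reads.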
Field Summary
Fields (Modifier and Type, Field, Description)
protected String altText - default text for alt attribute.
static final int ASCII - Deprecated.
protected boolean asciiChars - convert quotes and dashes to nearest ASCII char.
static final int BIG5 - Deprecated.
protected boolean bodyOnly - output BODY content only.
protected boolean breakBeforeBR - output newline before br or not?
protected boolean burstSlides - create slides on each h2 element.
protected String cssPrefix - CSS class naming for -clean option.
protected int definedTags - track what types of tags user has defined to eliminate unnecessary searches.
static final int DOCTYPE_AUTO - treatment of doctype: auto.
static final int DOCTYPE_LOOSE - treatment of doctype: loose.
static final int DOCTYPE_OMIT - treatment of doctype: omit.
static final int DOCTYPE_STRICT - treatment of doctype: strict.
static final int DOCTYPE_USER - treatment of doctype: user.
protected int docTypeMode - see doctype property.
protected String docTypeStr - user specified doctype.
protected boolean dropEmptyParas - discard empty p elements.
protected boolean dropFontTags - discard presentation tags.
protected boolean dropProprietaryAttributes - discard proprietary attributes.
protected int duplicateAttrs - Keep first or last duplicate attribute.
protected boolean emacs - if true format error output for GNU Emacs.
protected boolean encloseBlockText - if yes text in blocks is wrapped in p's.
protected boolean encloseBodyText - if yes text at body is wrapped in p's.
protected String errfile - file name to write errors to.
protected boolean escapeCdata - replace CDATA sections with escaped text.
protected boolean fixBackslash - fix URLs by replacing \ with /.
protected boolean fixComments - fix comments with adjacent hyphens.
protected boolean fixUri - properly escape URLs.
protected boolean forceOutput - output document even if errors were found.
protected boolean hideComments - hides all (real) comments in output.
protected boolean hideEndTags - suppress optional end tags.
protected boolean htmlOut - output plain-old HTML, even for XHTML input.
protected boolean indentAttributes - newline+indent before each attribute.
protected boolean indentCdata - indent CDATA sections.
protected boolean indentContent - indent content of appropriate tags.
static final int ISO2022 - Deprecated.
protected boolean joinClasses - join multiple class attributes.
protected boolean joinStyles - join multiple style attributes.
static final int KEEP_FIRST - Keep first duplicate attribute.
static final int KEEP_LAST - Keep last duplicate attribute.
protected boolean keepFileTimes - if yes last modified time is preserved.
protected String language - RJ language property.
static final int LATIN1 - Deprecated.
protected boolean literalAttribs - if true attributes may use newlines.
protected boolean logicalEmphasis - replace i by em and b by strong.
protected boolean lowerLiterals - folds known attribute values to lower case.
static final int MACROMAN - Deprecated.
protected boolean makeBare - Make bare HTML: remove Microsoft cruft.
protected boolean makeClean - remove presentational clutter.
protected boolean ncr - allow numeric character references.
protected char[] newline - bytes for the newline marker.
protected boolean numEntities - use numeric entities.
protected boolean onlyErrors - if true normal output is suppressed.
protected boolean quiet - no 'Parsing X', guessed DTD or summary.
protected boolean quoteAmpersand - output naked ampersand as &amp;.
protected boolean quoteMarks - output " marks as &quot;.
protected boolean quoteNbsp - output non-breaking space as entity.
static final int RAW - Deprecated.
protected boolean rawOut - Avoid mapping values > 127 to entities.
protected boolean replaceColor - replace hex color attribute values with names.
protected String replacementCharEncoding - char encoding used when replacing illegal SGML chars, regardless of specified encoding.
protected Report report - Report instance.
static final int SHIFTJIS - Deprecated.
protected int showErrors - number of errors to put out.
protected boolean showWarnings - show warnings; errors are always shown regardless.
protected String slidestyle - Deprecated. does nothing
protected boolean smartIndent - does text/block level content affect indentation.
protected int spaces - default indentation.
protected int tabsize - default tab size (8).
protected boolean tidyMark - add meta element indicating tidied doc.
protected boolean trimEmpty - trim empty elements.
protected TagTable tt - TagTable associated with this Configuration.
protected boolean upperCaseAttrs - output attributes in upper not lower case.
protected boolean upperCaseTags - output tags in upper not lower case.
static final int UTF16 - Deprecated.
static final int UTF16BE - Deprecated.
static final int UTF16LE - Deprecated.
static final int UTF8 - Deprecated.
static final int WIN1252 - Deprecated.
protected boolean word2000 - draconian cleaning for Word2000.
protected boolean wrapAsp - wrap within ASP pseudo elements.
protected boolean wrapAttVals - wrap within attribute values.
protected boolean wrapJste - wrap within JSTE pseudo elements.
protected int wraplen - default wrap margin (68).
protected boolean wrapPhp - wrap within PHP pseudo elements.
protected boolean wrapScriptlets - wrap within JavaScript string literals.
protected boolean wrapSection - wrap within CDATA section tags.
protected boolean writeback - if true then output tidied markup.
protected boolean xHTML - output extensible HTML.
protected boolean xmlOut - create output as XML.
protected boolean xmlPi - add <?xml?> for XML docs.
protected boolean xmlPIs - If set to yes PIs must end with ?>.
protected boolean xmlSpace - if set to yes adds xml:space attr as needed.
protected boolean xmlTags - treat input as XML.
-
Constructor Summary
Constructors (Modifier, Constructor, Description)
protected Configuration(Report report) - Instantiates a new Configuration.
-
Method Summary
Methods (Modifier and Type, Method, Description)
void addProps(Properties p) - adds configuration Properties.
void adjust() - Ensure that config is self consistent.
protected String convertCharEncoding(int code) - Convert a char encoding from the deprecated tidy constant to a standard java encoding name.
protected String getInCharEncodingName() - Getter for inCharEncodingName.
protected String getOutCharEncodingName() - Getter for outCharEncodingName.
static boolean isKnownOption(String name) - Is the given String a valid configuration flag?
void parseFile(String filename) - Parses a property file.
void printConfigOptions(Writer errout, boolean showActualConfiguration) - prints available configuration options.
protected void setInCharEncoding(int encoding) - Deprecated. use setInCharEncodingName(String)
protected void setInCharEncodingName(String encoding) - Setter for inCharEncodingName.
protected void setInOutEncodingName(String encoding) - Setter for inOutCharEncodingName.
protected void setOutCharEncoding(int encoding) - Deprecated. use setOutCharEncodingName(String)
protected void setOutCharEncodingName(String encoding) - Setter for outCharEncodingName.
-
Field Details
-
RAW
public static final int RAW
Deprecated. use Tidy.setRawOut(true) for raw output
character encoding = RAW.
-
ASCII
public static final int ASCII
Deprecated.
character encoding = ASCII.
-
LATIN1
public static final int LATIN1
Deprecated.
character encoding = LATIN1.
-
UTF8
public static final int UTF8
Deprecated.
character encoding = UTF8.
-
ISO2022
public static final int ISO2022
Deprecated.
character encoding = ISO2022.
-
MACROMAN
public static final int MACROMAN
Deprecated.
character encoding = MACROMAN.
-
UTF16LE
public static final int UTF16LE
Deprecated.
character encoding = UTF16LE.
-
UTF16BE
public static final int UTF16BE
Deprecated.
character encoding = UTF16BE.
-
UTF16
public static final int UTF16
Deprecated.
character encoding = UTF16.
-
WIN1252
public static final int WIN1252
Deprecated.
character encoding = WIN1252.
-
BIG5
public static final int BIG5
Deprecated.
character encoding = BIG5.
-
SHIFTJIS
public static final int SHIFTJIS
Deprecated.
character encoding = SHIFTJIS.
-
DOCTYPE_OMIT
public static final int DOCTYPE_OMIT
treatment of doctype: omit.
- To Do:
- should be an enumeration DocTypeMode
-
DOCTYPE_AUTO
public static final int DOCTYPE_AUTO
treatment of doctype: auto.
-
DOCTYPE_STRICT
public static final int DOCTYPE_STRICT
treatment of doctype: strict.
-
DOCTYPE_LOOSE
public static final int DOCTYPE_LOOSE
treatment of doctype: loose.
-
DOCTYPE_USER
public static final int DOCTYPE_USER
treatment of doctype: user.
-
KEEP_LAST
public static final int KEEP_LAST
Keep last duplicate attribute.
- To Do:
- should be an enumeration DupAttrMode
-
KEEP_FIRST
public static final int KEEP_FIRST
Keep first duplicate attribute.
-
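The To Do notes on DOCTYPE_OMIT and KEEP_LAST suggest replacing these int constants with enumerations named DocTypeMode and DupAttrMode. The following is a minimal sketch of what such enumerations might look like. Neither type exists in the documented class; the wrapper class DocTypeEnumsSketch, the helper parseDocType, and the rule that unrecognized text counts as a user-specified doctype are all assumptions made for illustration.

```java
import java.util.Locale;

final class DocTypeEnumsSketch {

    /** Hypothetical replacement for the DOCTYPE_* int constants. */
    enum DocTypeMode { OMIT, AUTO, STRICT, LOOSE, USER }

    /** Hypothetical replacement for KEEP_FIRST and KEEP_LAST. */
    enum DupAttrMode { KEEP_FIRST, KEEP_LAST }

    /**
     * Maps a doctype property value to a mode. Assumption: any value that is
     * not one of the four keywords is treated as a user-specified doctype.
     */
    static DocTypeMode parseDocType(String value) {
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "omit":   return DocTypeMode.OMIT;
            case "auto":   return DocTypeMode.AUTO;
            case "strict": return DocTypeMode.STRICT;
            case "loose":  return DocTypeMode.LOOSE;
            default:       return DocTypeMode.USER;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseDocType("strict"));
        System.out.println(DupAttrMode.KEEP_LAST);
    }
}
```

With enumerations, a field such as docTypeMode could no longer hold a value outside the five modes, which an int field cannot guarantee.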
spaces
protected int spaces
default indentation.
-
wraplen
protected int wraplen
default wrap margin (68).
-
tabsize
protected int tabsize
default tab size (8).
-
docTypeMode
protected int docTypeMode
see doctype property.
-
duplicateAttrs
protected int duplicateAttrs
Keep first or last duplicate attribute.
-
altText
protected String altText
default text for alt attribute.
-
slidestyle
protected String slidestyle
Deprecated. does nothing
style sheet for slides.
-
language
protected String language
RJ language property.
-
docTypeStr
protected String docTypeStr
user specified doctype.
-
errfile
protected String errfile
file name to write errors to.
-
writeback
protected boolean writeback
if true then output tidied markup.
-
onlyErrors
protected boolean onlyErrors
if true normal output is suppressed.
-
showWarnings
protected boolean showWarnings
show warnings; errors are always shown regardless.
-
quiet
protected boolean quiet
no 'Parsing X', guessed DTD or summary.
-
indentContent
protected boolean indentContent
indent content of appropriate tags.
-
smartIndent
protected boolean smartIndent
does text/block level content affect indentation.
-
hideEndTags
protected boolean hideEndTags
suppress optional end tags.
-
xmlTags
protected boolean xmlTags
treat input as XML.
-
xmlOut
protected boolean xmlOut
create output as XML.
-
xHTML
protected boolean xHTML
output extensible HTML.
-
htmlOut
protected boolean htmlOut
output plain-old HTML, even for XHTML input. Yes means set explicitly.
-
xmlPi
protected boolean xmlPi
add <?xml?> for XML docs.
-
upperCaseTags
protected boolean upperCaseTags
output tags in upper not lower case.
-
upperCaseAttrs
protected boolean upperCaseAttrs
output attributes in upper not lower case.
-
makeClean
protected boolean makeClean
remove presentational clutter.
-
makeBare
protected boolean makeBare
Make bare HTML: remove Microsoft cruft.
-
logicalEmphasis
protected boolean logicalEmphasis
replace i by em and b by strong.
-
dropFontTags
protected boolean dropFontTags
discard presentation tags.
-
dropProprietaryAttributes
protected boolean dropProprietaryAttributes
discard proprietary attributes.
-
dropEmptyParas
protected boolean dropEmptyParas
discard empty p elements.
-
fixComments
protected boolean fixComments
fix comments with adjacent hyphens.
-
trimEmpty
protected boolean trimEmpty
trim empty elements.
-
breakBeforeBR
protected boolean breakBeforeBR
output newline before br or not?
-
burstSlides
protected boolean burstSlides
create slides on each h2 element.
-
numEntities
protected boolean numEntities
use numeric entities.
-
quoteMarks
protected boolean quoteMarks
output " marks as &quot;.
-
quoteNbsp
protected boolean quoteNbsp
output non-breaking space as entity.
-
quoteAmpersand
protected boolean quoteAmpersand
output naked ampersand as &amp;.
-
wrapAttVals
protected boolean wrapAttVals
wrap within attribute values.
-
wrapScriptlets
protected boolean wrapScriptlets
wrap within JavaScript string literals.
-
wrapSection
protected boolean wrapSection
wrap within CDATA section tags.
-
wrapAsp
protected boolean wrapAsp
wrap within ASP pseudo elements.
-
wrapJste
protected boolean wrapJste
wrap within JSTE pseudo elements.
-
wrapPhp
protected boolean wrapPhp
wrap within PHP pseudo elements.
-
fixBackslash
protected boolean fixBackslash
fix URLs by replacing \ with /.
-
indentAttributes
protected boolean indentAttributes
newline+indent before each attribute.
-
xmlPIs
protected boolean xmlPIs
If set to yes PIs must end with ?>.
-
xmlSpace
protected boolean xmlSpace
if set to yes adds xml:space attr as needed.
-
encloseBodyText
protected boolean encloseBodyText
if yes text at body is wrapped in p's.
-
encloseBlockText
protected boolean encloseBlockText
if yes text in blocks is wrapped in p's.
-
keepFileTimes
protected boolean keepFileTimes
if yes last modified time is preserved.
-
word2000
protected boolean word2000
draconian cleaning for Word2000.
-
tidyMark
protected boolean tidyMark
add meta element indicating tidied doc.
-
emacs
protected boolean emacs
if true format error output for GNU Emacs.
-
literalAttribs
protected boolean literalAttribs
if true attributes may use newlines.
-
bodyOnly
protected boolean bodyOnly
output BODY content only.
-
fixUri
protected boolean fixUri
properly escape URLs.
-
lowerLiterals
protected boolean lowerLiterals
folds known attribute values to lower case.
-
replaceColor
protected boolean replaceColor
replace hex color attribute values with names.
-
hideComments
protected boolean hideComments
hides all (real) comments in output.
-
indentCdata
protected boolean indentCdata
indent CDATA sections.
-
forceOutput
protected boolean forceOutput
output document even if errors were found.
-
showErrors
protected int showErrors
number of errors to put out.
-
asciiChars
protected boolean asciiChars
convert quotes and dashes to nearest ASCII char.
-
joinClasses
protected boolean joinClasses
join multiple class attributes.
-
joinStyles
protected boolean joinStyles
join multiple style attributes.
-
escapeCdata
protected boolean escapeCdata
replace CDATA sections with escaped text.
-
ncr
protected boolean ncr
allow numeric character references.
-
cssPrefix
protected String cssPrefix
CSS class naming for -clean option.
-
replacementCharEncoding
protected String replacementCharEncoding
char encoding used when replacing illegal SGML chars, regardless of specified encoding.
-
tt
protected TagTable tt
TagTable associated with this Configuration.
-
report
protected Report report
Report instance. Used for messages.
-
definedTags
protected int definedTags
track what types of tags user has defined to eliminate unnecessary searches.
-
newline
protected char[] newline
bytes for the newline marker.
-
rawOut
protected boolean rawOut
Avoid mapping values > 127 to entities.
-
-
Constructor Details
-
Configuration
protected Configuration(Report report)
Instantiates a new Configuration. This constructor should be called by Tidy only.
- Parameters:
report - Report instance
-
-
Method Details
-
addProps
public void addProps(Properties p)
adds configuration Properties.
- Parameters:
p - Properties
-
parseFile
public void parseFile(String filename)
Parses a property file.
- Parameters:
filename - file name
-
isKnownOption
public static boolean isKnownOption(String name)
Is the given String a valid configuration flag?
- Parameters:
name - configuration parameter name
- Returns:
true if the given String is a valid config option
-
adjust
public void adjust()
Ensure that config is self consistent.
-
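This page does not list the rules adjust() applies, so the following is only a sketch of what a self-consistency pass over interdependent flags can look like. The class AdjustSketch is hypothetical; its field names are borrowed from this page, but the three rules are assumptions chosen for illustration and may differ from what the real method does.

```java
final class AdjustSketch {

    // Field names mirror the documented ones; the class itself is hypothetical.
    boolean smartIndent;
    boolean indentContent;
    boolean encloseBlockText;
    boolean encloseBodyText;
    boolean xmlOut;
    boolean hideEndTags;
    boolean quoteAmpersand = true;

    /** Resolves combinations of flags that would contradict each other. */
    void adjust() {
        // Assumed rule: smart indentation only makes sense if indentation is on.
        if (smartIndent) {
            indentContent = true;
        }
        // Assumed rule: wrapping text in blocks implies wrapping text at body level.
        if (encloseBlockText) {
            encloseBodyText = true;
        }
        // Assumed rule: XML output needs every end tag and escaped ampersands.
        if (xmlOut) {
            hideEndTags = false;
            quoteAmpersand = true;
        }
    }

    public static void main(String[] args) {
        AdjustSketch config = new AdjustSketch();
        config.smartIndent = true;
        config.xmlOut = true;
        config.hideEndTags = true;
        config.adjust();
        System.out.println("indentContent=" + config.indentContent
            + ", hideEndTags=" + config.hideEndTags);
    }
}
```

Running such a pass once, after all properties are loaded, means the order in which options appear in the configuration file does not matter.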
printConfigOptions
public void printConfigOptions(Writer errout, boolean showActualConfiguration)
prints available configuration options.
- Parameters:
errout - where to write
showActualConfiguration - print actual configuration values
-
getInCharEncodingName
protected String getInCharEncodingName()
Getter for inCharEncodingName.
- Returns:
Returns the inCharEncodingName.
-
setInCharEncodingName
protected void setInCharEncodingName(String encoding)
Setter for inCharEncodingName.
- Parameters:
encoding - The inCharEncodingName to set.
-
getOutCharEncodingName
protected String getOutCharEncodingName()
Getter for outCharEncodingName.
- Returns:
Returns the outCharEncodingName.
-
setOutCharEncodingName
protected void setOutCharEncodingName(String encoding)
Setter for outCharEncodingName.
- Parameters:
encoding - The outCharEncodingName to set.
-
setInOutEncodingName
protected void setInOutEncodingName(String encoding)
Setter for inOutCharEncodingName.
- Parameters:
encoding - The CharEncodingName to set.
-
setOutCharEncoding
protected void setOutCharEncoding(int encoding)
Deprecated. use setOutCharEncodingName(String)
Setter for outCharEncoding.
- Parameters:
encoding - The outCharEncoding to set.
-
setInCharEncoding
protected void setInCharEncoding(int encoding)
Deprecated. use setInCharEncodingName(String)
Setter for inCharEncoding.
- Parameters:
encoding - The inCharEncoding to set.
-
convertCharEncoding
protected String convertCharEncoding(int code)
Convert a char encoding from the deprecated tidy constant to a standard java encoding name.
- Parameters:
code - encoding code
- Returns:
encoding name
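To illustrate the kind of mapping convertCharEncoding describes, here is a minimal sketch that turns an int encoding constant into a Java charset name. The class CharEncodingSketch is hypothetical. The numeric values assigned to the constants and the exact names returned are assumptions, since this page states neither, and only a subset of the deprecated constants is covered.

```java
import java.nio.charset.Charset;

final class CharEncodingSketch {

    // Assumed numeric values; the page does not state the real ones.
    static final int ASCII = 1;
    static final int LATIN1 = 2;
    static final int UTF8 = 3;
    static final int UTF16LE = 6;
    static final int UTF16BE = 7;
    static final int UTF16 = 8;

    /** Returns a Java charset name for the code, or null if the code is not covered. */
    static String convertCharEncoding(int code) {
        switch (code) {
            case ASCII:   return "US-ASCII";
            case LATIN1:  return "ISO-8859-1";
            case UTF8:    return "UTF-8";
            case UTF16LE: return "UTF-16LE";
            case UTF16BE: return "UTF-16BE";
            case UTF16:   return "UTF-16";
            default:      return null; // codes outside this sketch
        }
    }

    public static void main(String[] args) {
        String name = convertCharEncoding(UTF8);
        // Charset.forName confirms the JVM recognizes the returned name.
        System.out.println(name + " -> " + Charset.forName(name));
    }
}
```

New code can skip the int constants entirely and pass an encoding name to setInCharEncodingName or setOutCharEncodingName, as the deprecation notes above recommend.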
-