FileBuzz: Software Download
Find shareware, freeware downloads from thousands of software titles

Program Name: Jericho HTML Parser for Windows

License Type: Freeware

Date Released: July 28, 2013

Jericho HTML Parser for Windows v3.2 Instant Download

Jericho HTML Parser for Windows Description:


Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML. It also provides high-level HTML form manipulation functions.

It is an open source library released under both the Eclipse Public License (EPL) and GNU Lesser General Public License (LGPL). You are therefore free to use it in commercial applications subject to the terms detailed in either one of these licence documents.

The javadocs provide comprehensive documentation of the entire API, as well as being a very useful reference on aspects of HTML and XML in general.

Features:

The library distinguishes itself from other HTML parsers with the following major features:

* The presence of badly formatted HTML does not interfere with the parsing of the rest of the document, which makes the library ideal for use with "real-world" HTML that chokes other parsers.
* ASP, JSP, PSP, PHP and Mason server tags are explicitly recognised by the parser. This means that normal HTML is still parsed properly even if there are server tags inside it, which is common, for example, when dynamically setting element attributes.
* A new stream based parsing option using the StreamedSource class, which allows memory efficient processing of large files using an event iterator. This is essentially a StAX alternative with the ability to process HTML and non-validating XML, as well as several other features not available in other streaming parsers.
* In its standard form it is neither an event nor tree based parser, but rather uses a combination of simple text search, efficient tag recognition and a tag position cache. The text of the whole source document is first loaded into memory, and then only the relevant segments are searched for the relevant characters of each search operation.
* Compared to a tree based parser such as DOM, the memory and resource requirements can be far better if only small sections of the document need to be parsed or modified. Incorrect or badly formatted HTML can easily be ignored, unlike tree based parsers which must identify every node in the document from top to bottom.
* Compared to an event based parser such as SAX, the interface is on a much higher level and more intuitive, and a tree representation of the document element hierarchy is easily created if required.
* The begin and end positions in the source document of all parsed segments are accessible, allowing modification of only selected segments of the document without having to reconstruct the entire document from a tree.
* The row and column number of each position in the source document are easily accessible.
* Provides a simple but comprehensive interface for the analysis and manipulation of HTML form controls, including the extraction and population of initial values, and conversion to read-only or data display modes. Analysis of the form controls also allows data received from the form to be stored and presented in an appropriate manner.
* Custom tag types can be easily defined and registered for recognition by the parser.
* Built-in functionality to extract all text from HTML markup, suitable for feeding into a text search engine such as Apache Lucene.
* Built-in functionality to render HTML markup with simple text formatting.
* Built-in functionality to format HTML source code that indents elements according to their depth in the document element hierarchy.
* Built-in functionality to compact HTML source code by removing all unnecessary white space.
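
The position-based editing and row/column lookup described in the feature list can be illustrated with a small standalone sketch. It deliberately does not use the library's own classes: `SegmentEditSketch`, `Edit`, `apply` and `rowColumn` are hypothetical names invented for this example, and the code only assumes that a parser has already supplied the begin and end offsets of a segment. It shows why knowing those offsets lets the rest of the document, including server tags and unclosed elements, be copied through verbatim.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only; not the library's API or implementation. */
public class SegmentEditSketch {

    /** A replacement of the characters in [begin, end) of the source text. */
    static final class Edit {
        final int begin;
        final int end;
        final String replacement;

        Edit(int begin, int end, String replacement) {
            this.begin = begin;
            this.end = end;
            this.replacement = replacement;
        }
    }

    /** Applies non-overlapping edits; every untouched character is copied as-is. */
    static String apply(String source, List<Edit> edits) {
        List<Edit> sorted = new ArrayList<>(edits);
        sorted.sort(Comparator.comparingInt(e -> e.begin));
        StringBuilder out = new StringBuilder();
        int last = 0; // end offset of the previously applied edit
        for (Edit e : sorted) {
            if (e.begin < last || e.end < e.begin || e.end > source.length()) {
                throw new IllegalArgumentException("overlapping or out-of-range edit");
            }
            out.append(source, last, e.begin); // verbatim text before the edit
            out.append(e.replacement);
            last = e.end;
        }
        out.append(source, last, source.length()); // verbatim tail
        return out.toString();
    }

    /** Returns the 1-based {row, column} of a character offset, treating '\n' as the line break. */
    static int[] rowColumn(String source, int pos) {
        int row = 1;
        int col = 1;
        for (int i = 0; i < pos; i++) {
            if (source.charAt(i) == '\n') {
                row++;
                col = 1;
            } else {
                col++;
            }
        }
        return new int[] {row, col};
    }

    public static void main(String[] args) {
        // Contains a server tag inside an attribute and an unclosed <p>.
        String source = "<p>Hello\n<a href=\"<%= url %>\">link</a>\n<p>unclosed";
        int begin = source.indexOf("link"); // stands in for a parser-supplied offset
        int end = begin + "link".length();

        System.out.println(apply(source, List.of(new Edit(begin, end, "home"))));

        int[] rc = rowColumn(source, begin);
        System.out.println("row " + rc[0] + ", column " + rc[1]);
    }
}
```

Applying the edits in ascending offset order means the output is built in a single pass and nothing outside the edited spans is re-serialised, which is the property the feature list contrasts with rebuilding a document from a tree.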

License: Freeware | Price: $0.00 | Size: 2.1 MB | Downloads (112)

Platform: Win All

Related Software

Top Search

Html Parser Search Replace
Html Skin Windows Media Player
Html Parser Interactive
Html Templates Windows Live Mail
Html Reader Windows Mobile
Html Parser Simple
Html Control Windows Form
Html Parser Activex
Digital Clock Html Codes Windows
Drag And Drop Html Editor Windows
Using Html Page In Windows Applications
Html Editor Windows Mobile
Email Html Parser
Html Parser Editor
Html Editor In Windows Forms
Php Parser For Windows
Share Html Code Windows Live
Html Set Windows Size
Html Editor For Windows Mobile
Java Html Parser
Html Maker Windows Mobile
Html Open Windows Hide Address
Graphical Html Parser
Html Parser With Javascript Vba
Online Html Parser