Herold converts HTML files to DocBook files. It tries to detect the structure of the HTML code by analyzing the header elements. Herold is able to suppress table elements and to serialize the contents. Furthermore, you can exclude certain elements via XPath expressions.
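Example: the heading-based structure detection described above can be illustrated with a minimal sketch. This is not Herold's own code or API. It assumes that "header elements" means the HTML headings h1 to h6, and the names HeadingToSection and html_to_docbook are hypothetical. Each heading opens a DocBook-style section, and a heading of the same or a higher rank closes the sections opened before it.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape


def _is_heading(tag):
    # True for h1..h6 only (so "hr" or "head" do not match).
    return len(tag) == 2 and tag[0] == "h" and tag[1] in "123456"


class HeadingToSection(HTMLParser):
    """Hypothetical helper: nests <section> elements by heading level."""

    def __init__(self):
        super().__init__()
        self.out = []          # output fragments
        self.open_levels = []  # heading levels of currently open sections
        self.in_heading = False
        self.in_para = False

    def handle_starttag(self, tag, attrs):
        if _is_heading(tag):
            level = int(tag[1])
            # Close every open section at the same or a deeper level.
            while self.open_levels and self.open_levels[-1] >= level:
                self.open_levels.pop()
                self.out.append("</section>")
            self.open_levels.append(level)
            self.out.append("<section><title>")
            self.in_heading = True
        elif tag == "p":
            self.out.append("<para>")
            self.in_para = True

    def handle_endtag(self, tag):
        if _is_heading(tag) and self.in_heading:
            self.out.append("</title>")
            self.in_heading = False
        elif tag == "p" and self.in_para:
            self.out.append("</para>")
            self.in_para = False

    def handle_data(self, data):
        # Only text inside headings and paragraphs is kept in this sketch.
        if self.in_heading or self.in_para:
            self.out.append(escape(data))

    def result(self):
        self.close()
        tail = "</section>" * len(self.open_levels)
        return "<article>" + "".join(self.out) + tail + "</article>"


def html_to_docbook(html):
    parser = HeadingToSection()
    parser.feed(html)
    return parser.result()


if __name__ == "__main__":
    print(html_to_docbook("<h1>Intro</h1><p>Text</p><h2>Details</h2><p>More</p>"))
```

The sketch handles only headings and explicitly closed paragraphs; tables, images, meta elements, and XPath-based exclusion are left out.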
Release Notes: The lang attribute of pre elements is now preserved. Command line arguments were cleaned up. New and improved profiles were provided. Creation of invalid DocBook XML when transforming an element with a nested img element was fixed. Processing of meta elements was added, and minor fixes were made.
Licenses: GPLv3