Frequently asked questions

Question: How can I extract emphasized text from a document, for example all bold, italic, or underlined text, along with any other layout evidence of emphasis such as the font size?

Answer: The ODF model consists of "blocks" (runs) of text, each of which can carry a style reference. Each referenced style has a definition that states exactly which text attributes it corresponds to. There are two approaches you can use.

  1. First identify which styles have the bold (or italic) attribute. The document might have more than one style that defines bold text. Then find the text blocks that reference any of those styles.

  2. For each text block, identify its style and resolve that style's underlying text attributes. If the result is bold (or italic, or whatever you are looking for), extract the text.
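The first approach can be sketched in a few lines. The example below is not the ODF Toolkit API; it is a minimal illustration in plain Python that reads `content.xml` directly with the standard library. The helper names (`read_content_xml`, `matching_style_names`, `extract_styled_text`) are hypothetical. It assumes that the relevant styles are defined in `content.xml` itself (typically the automatic styles), and it does not follow `style:parent-style-name` inheritance or look into `styles.xml`, so a real extractor would need to resolve those as well.

```python
import xml.etree.ElementTree as ET
import zipfile

# ODF namespace URIs used in content.xml.
NS = {
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
}


def _q(prefix, local):
    """Build the '{namespace}local' form that ElementTree uses for names."""
    return "{%s}%s" % (NS[prefix], local)


# Elements treated as "text blocks" in this sketch (an assumption).
TEXT_TAGS = {_q("text", "p"), _q("text", "h"), _q("text", "span")}


def read_content_xml(path):
    """Return the raw content.xml bytes from an .odt package (a ZIP file)."""
    with zipfile.ZipFile(path) as package:
        return package.read("content.xml")


def matching_style_names(root, prefix, local, value):
    """Step 1a: collect names of styles whose text properties set attr=value."""
    names = set()
    for style in root.iter(_q("style", "style")):
        props = style.find(_q("style", "text-properties"))
        name = style.get(_q("style", "name"))
        if props is not None and name and props.get(_q(prefix, local)) == value:
            names.add(name)
    return names


def extract_styled_text(content_xml, prefix="fo", local="font-weight", value="bold"):
    """Step 1b: return the text of every block that references a matching style."""
    root = ET.fromstring(content_xml)
    wanted = matching_style_names(root, prefix, local, value)
    runs = []
    for element in root.iter():  # document order
        if element.tag in TEXT_TAGS and element.get(_q("text", "style-name")) in wanted:
            # itertext() also includes text of nested spans, so a matching
            # span inside a matching paragraph is reported twice.
            runs.append("".join(element.itertext()))
    return runs
```

With these helpers, `extract_styled_text(read_content_xml("example.odt"))` would list the bold runs of a file (the file name is a placeholder), and passing `local="font-style", value="italic"` would look for italic text instead. Other attributes, such as `fo:font-size`, can be matched the same way when an exact value is known.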
