python3-html-text binary package in Ubuntu Oracular amd64

 How is html_text different from .xpath('//text()') from lxml or .get_text()
 from Beautiful Soup?

  * Text extracted with html_text does not contain inline styles,
    javascript, comments and other text that is not normally visible to
    users;
  * html_text normalizes whitespace, but in a way smarter than
    .xpath('normalize-space()'), adding spaces around inline elements (which
    are often used as block elements in HTML markup), and trying to avoid
    adding extra spaces for punctuation;
  * html_text can add newlines (e.g. after headers or paragraphs), so that
    the output text looks more like how it is rendered in browsers.
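
The three behaviours listed above can be illustrated with a small standard-library sketch. This is not html_text's implementation: the class name VisibleTextParser, the function extract_visible_text, and the two tag sets are hypothetical names chosen for this example, and the punctuation rule is a deliberately crude heuristic.

```python
import re
from html.parser import HTMLParser

# Assumption: tag sets picked for this sketch only, not taken from html_text.
HIDDEN_TAGS = {"script", "style"}
NEWLINE_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "br"}


class VisibleTextParser(HTMLParser):
    """Collects text a user would normally see, with layout hints."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.hidden_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in HIDDEN_TAGS:
            self.hidden_depth += 1
        elif tag in NEWLINE_TAGS:
            self.parts.append("\n")
        else:
            # Space around other elements so adjacent words do not fuse.
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS:
            self.hidden_depth = max(0, self.hidden_depth - 1)
        elif tag in NEWLINE_TAGS:
            self.parts.append("\n")
        else:
            self.parts.append(" ")

    def handle_data(self, data):
        # Skip script/style bodies; comments never reach this method.
        if self.hidden_depth == 0:
            # Source whitespace (including newlines) becomes single spaces,
            # so "\n" in self.parts only ever comes from block-level tags.
            self.parts.append(re.sub(r"\s+", " ", data))


def extract_visible_text(html):
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    lines = []
    for line in "".join(parser.parts).split("\n"):
        line = re.sub(r" +", " ", line).strip()
        # Crude heuristic: drop the space inserted before punctuation.
        line = re.sub(r" ([,.;:!?])", r"\1", line)
        if line:
            lines.append(line)
    return "\n".join(lines)


sample = (
    "<html><head><style>p {color: red}</style></head><body>"
    "<h1>Title</h1><!-- note -->"
    "<p>Hello <b>world</b>, this   is\n text.</p>"
    "<script>var x = 1;</script>"
    "<p>Second<span>one</span></p></body></html>"
)
print(extract_visible_text(sample))
# Title
# Hello world, this is text.
# Second one
```

With the packaged library itself, the corresponding call should be html_text.extract_text(html); its exact output for the sample above may differ from this sketch.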

Publishing history

Date Status Target Pocket Component Section Priority Phased updates Version
  2024-05-20 01:58:02 UTC Published Ubuntu Oracular amd64 release universe python Optional 0.6.2-1
  • Published
  • Copied from ubuntu oracular-proposed amd64 in Primary Archive for Ubuntu
  Deleted Ubuntu Oracular amd64 proposed universe python Optional 0.6.2-1
  • Removal requested.
  • Deleted by Ubuntu Archive Auto-Sync

    Moved to oracular

  • Published
  2024-05-20 01:58:21 UTC Superseded Ubuntu Oracular amd64 release universe python Optional 0.5.2-2
  • Removal requested.
  • Superseded by amd64 build of html-text 0.6.2-1 in ubuntu oracular PROPOSED
  • Published
  • Copied from ubuntu lunar-proposed amd64 in Primary Archive for Ubuntu