javascript dom google-chrome-extension

How to extract text content preserving its formatting using DOM


I am developing a Chrome extension that can extract job descriptions from LinkedIn job posts. However, when I use .textContent or .innerText in DOM manipulation to extract the job description, the output does not match the formatting or appearance of manually copying and pasting the job description into a document. How can I resolve this issue?


Solution

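  • As @dandavis says, I think you can use element.innerHTML to preserve the formatting, if by formatting you mean things like lists, font weight, font size, etc. Unlike .textContent and .innerText, which return only text, innerHTML returns the element's markup as well.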

    For example, for this page: https://www.linkedin.com/jobs/collections/hiring-in-network/?currentJobId=4093133656

    I can run document.querySelector('article').innerHTML to obtain the job description with its formatting, as shown here:

    <div class="jobs-description__content jobs-description-content
    ">
        <div class="jobs-box__html-content
      gCeSRvuwijzmTJHsHVmaREDPgtRKEJIw
      t-14 t-normal
      jobs-description-content__text--stretch" id="job-details" tabindex="-1">
            <h2 class="text-heading-large">
                About the job
            </h2>
    
            <!----> <div class="mt4">
                <p dir="ltr">
                    <span><p><!---->Hi! We are GeekGarden, a consultant IT
                            company.<!----></p></span><span><p><!---->We're hiring
                            the best candidate for
                            position:<!----></p></span><span><p><span><br></span></p></span><span><p><!---->Backend
                            Developer Golang<!----><span><strong><span
                                        class="white-space-pre">
                                    </span>OR<!----></strong></span><span
                                class="white-space-pre">
                            </span>Ruby<!----></p></span><span><p><span><br></span></p></span><span><p><!---->This
                            position is for our client : SaaS
                            company<!----></p></span><span><p><!---->Onsite
                            Yogyakarta<!----></p></span><span><p><!---->For 6 months
                            contract
                            base<!----></p></span><span><p><span><br></span></p></span><span><p><!---->Job
                            Descriptions<!----></p></span><span>
                        <ul><span><li><!---->Design, develop, and maintain backend
                                    services and APIs using Golang or using Ruby on
                                    Rails<!----></li></span><span><li><!---->Write
                                    clean, maintainable, and efficient
                                    code<!----></li></span><span><li><!---->Implement
                                    scalable server-side logic and optimize
                                    applications for
                                    performance<!----></li></span><span><li><!---->Collaborate
                                    with the product team to understand requirements
                                    and translate them into technical
                                    specifications<!----></li></span><span><li><!---->Integrate
                                    third-party APIs and services as
                                    needed<!----></li></span><span><li><!---->Identify,
                                    troubleshoot, and resolve performance
                                    bottlenecks and
                                    bugs<!----></li></span><span><li><!---->Conduct
                                    code reviews and provide constructive feedback
                                    to other team
                                    members<!----></li></span><span><li><!---->Ensure
                                    data security, system scalability, and high
                                    availability<!----></li></span><span><li><!---->Participate
                                    in the full software development lifecycle,
                                    including testing and
                                    deployment<!----></li></span></ul>
                    </span><span><p><span><br></span></p></span><span><p><!---->Job
                            Requirements<!----></p></span><span>
                        <ul><span><li><!---->Bachelor's degree in Computer Science,
                                    Engineering, or a related field (or equivalent
                                    experience)<!----></li></span><span><li><!---->3+
                                    years of proven experience as a Backend
                                    Developer<!----></li></span><span><li><!---->Strong
                                    expertise in Golang or Ruby on Rails programming
                                    languages<!----></li></span><span><li><!---->Experience
                                    with RESTful APIs, microservices architecture,
                                    and web service
                                    frameworks<!----></li></span><span><li><!---->Familiarity
                                    with SQL and NoSQL databases (e.g., PostgreSQL,
                                    MySQL, MongoDB,
                                    Redis)<!----></li></span><span><li><!---->Understanding
                                    of version control systems (e.g.,
                                    Git)<!----></li></span><span><li><!---->Strong
                                    problem-solving skills and the ability to write
                                    efficient, scalable
                                    code<!----></li></span><span><li><span><strong><!---->Willing
                                            to work with project-based
                                            contract<!----></strong></span></li></span><span><li><span><strong><!---->Willing
                                            to join
                                            immediately<!----></strong></span></li></span><span><li><span><strong><!---->Willing
                                            to work onsite in
                                            Yogyakarta<!----></strong></span></li></span></ul>
                    </span>
                </p>
                <!----> </div>
        </div>
        <div class="jobs-description__details">
            <!----> </div>
    </div>
    

    If this is not what you're looking for, please specify what you mean by formatting.
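
The innerHTML output shown above still contains empty comment markers (<!---->) and pretty-printing whitespace. Here is a minimal sketch of how a content script might grab and tidy it. It assumes the #job-details id seen in that output is present and falls back to the article element otherwise; LinkedIn's markup may change at any time. tidyHtml and extractJobDescriptionHtml are hypothetical helper names, not part of any API.

```javascript
// Hypothetical helper: tidies a raw innerHTML string while keeping its tags.
function tidyHtml(html) {
  return html
    .replace(/<!--[\s\S]*?-->/g, "") // drop comment markers such as <!---->
    .replace(/\s+/g, " ")            // collapse runs of whitespace and newlines
    .trim();
}

// Hypothetical helper: `root` is anything with querySelector (e.g. document).
// "#job-details" is taken from the sample output; "article" is the selector
// used in the answer. Both are assumptions about LinkedIn's current markup.
function extractJobDescriptionHtml(root) {
  const el = root.querySelector("#job-details") || root.querySelector("article");
  return el ? tidyHtml(el.innerHTML) : null;
}

// Only runs where a DOM exists (for example in a content script).
if (typeof document !== "undefined") {
  console.log(extractJobDescriptionHtml(document));
}
```

The regular expressions here only strip comments and whitespace; they do not sanitize the HTML, so treat the result as untrusted markup if you render it elsewhere.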