python-requests python-requests-html

Request to the HAL API


I have this request for the HAL API, built from several parameters to find a publication, but the parameters are not taken into account:

"https://api.archives-ouvertes.fr/search/?q=title_t:{0}&q=producedDateY_i:{1}&q=authFullName_s:{2}&wt=json&fl=doc_id,producedDateY_i,authFullName_s,title_t,doiId_s".format(keyword, year, authors)

For example, I initialize:

keyword = "consanguinite"   # is a keyword of the title
year = "1982"
authors = "Lefebvre"

and the output is:

{'response': {'numFound': 79, 'start': 0, 'maxScore': 9.001673, 'numFoundExact': True, 'docs': [{'authFullName_s': ['M. Gillois'], 'producedDateY_i': 1996}, {'authFullName_s': ['G. Malécot'], 'producedDateY_i': 1969}, {'authFullName_s': [". Departement de Genetique Et d'Amelioration Des Plantes"], 'producedDateY_i': 1988}, {'authFullName_s': ['Marie-Luce Gélard'], 'producedDateY_i': 2009}, {'authFullName_s': ['P.L. Lefort'], 'producedDateY_i': 1988}, {'authFullName_s': ['R. Hanset'], 'producedDateY_i': 1973}, {'authFullName_s': ['Claude Chevalet', 'J.M. Cornuet'], 'producedDateY_i': 1982}, {'authFullName_s': ['Chevalet Cl.', 'Cornuet J.-M.'], 'producedDateY_i': 1982}, {'authFullName_s': ['H. Rochambeau', 'Claude C. Chevalet', 'A. Malafosse'], 'producedDateY_i': 1979}, {'authFullName_s': ['Claude Chevalet', 'Hubert de Rochambeau', 'A. Malafosse'], 'producedDateY_i': 1979},

So there is a lot of noise in the output, and the fields I am looking for (doc_id, doiId_s, ...) are missing.

Do you know why this is happening and how I can fix it? Thanks.

I tried replacing the fields

title_t:{0}&q=producedDateY_i:{1}&q=authFullName_s:{2}

with

title_t:{0}&q=producedDateY_i:{1}&q=authFullName_s:{2}label_s:(title&&authors&&year)

Solution

  • Looking at the documentation, you can combine the conditions in a single q parameter joined with AND, use ~ for fuzzy (non-exact) matching, and use * as a wildcard in fl (fl=* returns all fields):

    For example:

    import requests
    
    api_url = "https://api.archives-ouvertes.fr/search/"
    
    keyword = "consanguinite"
    year = "1982"
    authors = "Lefebvre"
    
    params = {
        "q": f"title_t:{keyword}~ AND producedDateY_i:{year} AND authFullName_t:{authors}~",
        "wt": "json",
        "fl": "producedDateY_i,authFullName_s,do*,tit*",
    }
    
    data = requests.get(api_url, params=params).json()
    print(data)
    

    Prints:

    {
        "response": {
            "numFound": 1,
            "start": 0,
            "maxScore": 12.277535,
            "numFoundExact": True,
            "docs": [
                {
                    "docid": "2780955",
                    "domainAllCode_s": ["sdv"],
                    "domain_s": ["0.sdv"],
                    "title_s": [
                        "Programme sur micro ordinateur de calcul de parente et de consanguinite sur 16 generations."
                    ],
                    "authFullName_s": ["J. Lefebvre"],
                    "docType_s": "COMM",
                    "producedDateY_i": 1982,
                }
            ],
        }
    }
    

    Or, to return all fields:

    import requests
    
    api_url = "https://api.archives-ouvertes.fr/search/"
    
    keyword = "consanguinite"
    year = "1982"
    authors = "Lefebvre"
    
    params = {
        "q": f"title_t:{keyword}~ AND producedDateY_i:{year} AND authFullName_t:{authors}~",
        "wt": "json",
        "fl": "*",    # <-- return all fields
    }
    
    data = requests.get(api_url, params=params).json()
    print(data)
    

    Prints:

    {
        "response": {
            "numFound": 1,
            "start": 0,
            "maxScore": 12.286329,
            "numFoundExact": True,
            "docs": [
                {
                    "docid": "2780955",
                    "label_s": "J. Lefebvre. Programme sur micro ordinateur de calcul de parente et de consanguinite sur 16 generations.. Informatique et Biosphere, Xeme colloque, 1982, Paris. &#x27E8;hal-02780955&#x27E9;",
                    "citationRef_s": "<i>Informatique et Biosphere, Xeme colloque</i>, 1982, Paris",
                    "citationFull_s": 'J. Lefebvre. Programme sur micro ordinateur de calcul de parente et de consanguinite sur 16 generations.. <i>Informatique et Biosphere, Xeme colloque</i>, 1982, Paris. <a target="_blank" href="https://hal.inrae.fr/hal-02780955">&#x27E8;hal-02780955&#x27E9;</a>',
                    "label_bibtex": "@inproceedings{lefebvre:hal-02780955,\n  TITLE = {{Programme sur micro ordinateur de calcul de parente et de consanguinite sur 16 generations.}},\n  AUTHOR = {Lefebvre, J.},\n  URL = {https://hal.inrae.fr/hal-02780955},\n  BOOKTITLE = {{Informatique et Biosphere, Xeme colloque}},\n  ADDRESS = {Paris},\n  PUBLISHER = {{INA-PG}},\n  SERIES = {Informatique et Biosphere, Xeme colloque, Micro informatique et biosphere},\n  YEAR = {1982},\n  HAL_ID = {hal-02780955},\n  HAL_VERSION = {v1},\n}\n",
                    "label_endnote": "%0 Conference Paper\n%T Programme sur micro ordinateur de calcul de parente et de consanguinite sur 16 generations.\n%A Lefebvre, J.\n%F Invité\n%< sans comité de lecture\n%B Informatique et Biosphere, Xeme colloque\n%C Paris, \n%I INA-PG\n%C Paris\n%S Informatique et Biosphere, Xeme colloque, Micro informatique et biosphere\n%8 1982\n%D 1982\n%Z Life Sciences [q-bio]Conference papers\n%G French\n%L hal-02780955\n%U https://hal.inrae.fr/hal-02780955\n%~ INRAE\n",
                    "label_coins": '<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rft.type=proceedings&amp;rft.identifier=https%3A%2F%2Fhal.inrae.fr%2Fhal-02780955&amp;rft.identifier=hal-02780955&amp;rft.identifier=prodinra%3A8214&amp;rft.title=Programme%20sur%20micro%20ordinateur%20de%20calcul%20de%20parente%20et%20de%20consanguinite%20sur%2016%20generations.&amp;rft.creator=Lefebvre%2C%20J.&amp;rft.language=fr&amp;rft.date=1982&amp;rft.source=Informatique%20et%20Biosphere%2C%20Xeme%20colloque&amp;rft.coverage=Paris&amp;rft.coverage="></span>',
                    "openAccess_bool": False,
                    "inra_publicVise_local_s": ["AU"],
                    "popularLevel_s": "0",
                    "inra_otherType_Comm_local_s": ["FT"],
                    "peerReviewing_s": "0",
                    "invitedCommunication_s": "1",
                    "audience_s": "2",
                    "domainAllCode_s": ["sdv"],
                    "level0_domain_s": ["sdv"],
                    "domain_s": ["0.sdv"],
                    "fr_domainAllCodeLabel_fs": ["sdv_FacetSep_Sciences du Vivant [q-bio]"],
                    "en_domainAllCodeLabel_fs": ["sdv_FacetSep_Life Sciences [q-bio]"],
                    "es_domainAllCodeLabel_fs": ["sdv_FacetSep_Life Sciences [q-bio]"],
                    "eu_domainAllCodeLabel_fs": ["sdv_FacetSep_domain_sdv"],
                    "primaryDomain_s": "sdv",
                    "fr_title_s": [
                        "Programme sur micro ordinateur de calcul de parente et de consanguinite sur 16 generations."
                    ],
                    "title_s": [
                        "Programme sur micro ordinateur de calcul de parente et de consanguinite sur 16 generations."
                    ],
                    "conferenceTitle_s": "Informatique et Biosphere, Xeme colloque",
                    "conferenceStartDate_s": "1982",
                    "conferenceStartDate_tdate": "1982-01-01T00:00:00Z",
                    "conferenceStartDateY_i": 1982,
                    "conferenceEndDate_s": "1982",
                    "conferenceEndDate_tdate": "1982-01-01T00:00:00Z",
                    "conferenceEndDateY_i": 1982,
                    "authIdFormPerson_s": ["159685-0"],
                    "authIdForm_i": [159685],
                    "authLastName_s": ["Lefebvre"],
                    "authFirstName_s": ["J."],
                    "authFullName_s": ["J. Lefebvre"],
                    "authLastNameFirstName_s": ["Lefebvre J."],
                    "authIdLastNameFirstName_fs": ["0_FacetSep_Lefebvre J."],
                    "authFullNameIdFormPerson_fs": ["J. Lefebvre_FacetSep_159685-0"],
                    "authAlphaLastNameFirstNameId_fs": [
                        "L_AlphaSep_Lefebvre J._FacetSep_0"
                    ],
                    "authIdFullName_fs": ["0_FacetSep_J. Lefebvre"],
                    "authFullNameId_fs": ["J. Lefebvre_FacetSep_0"],
                    "authQuality_s": ["aut"],
                    "authFullNameFormIDPersonIDIDHal_fs": [
                        "J. Lefebvre_FacetSep_159685-0_FacetSep_"
                    ],
                    "authFullNamePersonIDIDHal_fs": ["J. Lefebvre_FacetSep_0_FacetSep_"],
                    "authIdHalFullName_fs": ["_FacetSep_J. Lefebvre"],
                    "authFullNameIdHal_fs": ["J. Lefebvre_FacetSep_"],
                    "authAlphaLastNameFirstNameIdHal_fs": [
                        "L_AlphaSep_Lefebvre J._FacetSep_"
                    ],
                    "authLastNameFirstNameIdHalPersonid_fs": [
                        "Lefebvre J._FacetSep__FacetSep_0"
                    ],
                    "contributorId_i": 888556,
                    "contributorFullName_s": "Migration ProdInra",
                    "contributorIdFullName_fs": "888556_FacetSep_Migration ProdInra",
                    "contributorFullNameId_fs": "Migration ProdInra_FacetSep_888556",
                    "country_s": "INC",
                    "language_s": ["fr"],
                    "halId_s": "hal-02780955",
                    "uri_s": "https://hal.inrae.fr/hal-02780955",
                    "version_i": 1,
                    "status_i": 11,
                    "instance_s": "inrae",
                    "sid_i": 7808,
                    "submitType_s": "notice",
                    "docType_s": "COMM",
                    "oldDocType_s": "COMM",
                    "selfArchiving_bool": False,
                    "city_s": "Paris",
                    "publicationLocation_s": ["Paris"],
                    "publisher_s": ["INA-PG"],
                    "serie_s": [
                        "Informatique et Biosphere, Xeme colloque, Micro informatique et biosphere"
                    ],
                    "inPress_bool": False,
                    "prodinraId_s": ["8214"],
                    "modifiedDate_tdate": "2020-06-12T10:43:26Z",
                    "modifiedDate_s": "2020-06-12 10:43:26",
                    "modifiedDateY_i": 2020,
                    "modifiedDateM_i": 6,
                    "modifiedDateD_i": 12,
                    "submittedDate_tdate": "2020-06-04T17:57:51Z",
                    "submittedDate_s": "2020-06-04 17:57:51",
                    "submittedDateY_i": 2020,
                    "submittedDateM_i": 6,
                    "submittedDateD_i": 4,
                    "releasedDate_tdate": "2020-06-04T17:57:51Z",
                    "releasedDate_s": "2020-06-04 17:57:51",
                    "releasedDateY_i": 2020,
                    "releasedDateM_i": 6,
                    "releasedDateD_i": 4,
                    "producedDate_tdate": "1982-01-01T00:00:00Z",
                    "producedDate_s": "1982",
                    "producedDateY_i": 1982,
                    "publicationDate_tdate": "1982-01-01T00:00:00Z",
                    "publicationDate_s": "1982",
                    "publicationDateY_i": 1982,
                    "owners_i": [888556],
                    "collId_i": [7807],
                    "collName_s": [
                        "Institut National de Recherche en Agriculture, Alimentation et Environnement"
                    ],
                    "collCode_s": ["INRAE"],
                    "collCategory_s": ["INSTITUTION"],
                    "collIdName_fs": [
                        "7807_FacetSep_Institut National de Recherche en Agriculture, Alimentation et Environnement"
                    ],
                    "collNameId_fs": [
                        "Institut National de Recherche en Agriculture, Alimentation et Environnement_FacetSep_7807"
                    ],
                    "collCodeName_fs": [
                        "INRAE_FacetSep_Institut National de Recherche en Agriculture, Alimentation et Environnement"
                    ],
                    "collCategoryCodeName_fs": [
                        "INSTITUTION_JoinSep_INRAE_FacetSep_Institut National de Recherche en Agriculture, Alimentation et Environnement"
                    ],
                    "collNameCode_fs": [
                        "Institut National de Recherche en Agriculture, Alimentation et Environnement_FacetSep_INRAE"
                    ],
                    "_version_": 1751518864358768640,
                    "dateLastIndexed_tdate": "2022-12-07T02:02:35.646Z",
                    "label_xml": '<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:hal="http://hal.archives-ouvertes.fr/" version="1.1" xsi:schemaLocation="http://www.tei-c.org/ns/1.0 http://api.archives-ouvertes.fr/documents/aofr-sword.xsd">  <teiHeader>    <fileDesc>      <titleStmt>        <title>HAL TEI export of hal-02780955</title>      </titleStmt>      <publicationStmt>        <distributor>CCSD</distributor>        <availability status="restricted">          <licence target="http://creativecommons.org/licenses/by/4.0/">Distributed under a Creative Commons Attribution 4.0 International License</licence>        </availability>        <date when="2022-12-07T03:02:35+01:00"/>      </publicationStmt>      <sourceDesc>        <p part="N">HAL API platform</p>      </sourceDesc>    </fileDesc>  </teiHeader>  <text>    <body>      <listBibl>        <biblFull>          <titleStmt>            <title xml:lang="fr">Programme sur micro ordinateur de calcul de parente et de consanguinite sur 16 generations.</title>            <author role="aut">              <persName>                <forename type="first">J.</forename>                <surname>Lefebvre</surname>              </persName>              <idno type="halauthorid">159685-0</idno>            </author>            <editor role="depositor">              <persName>                <forename>Migration</forename>                <surname>ProdInra</surname>              </persName>              <email type="md5">9e5ed7d56b52b3d4383123b1ffb42236</email>              <email type="domain">inrae.fr</email>            </editor>          </titleStmt>          <editionStmt>            <edition n="v1" type="current">              <date type="whenSubmitted">2020-06-04 17:57:51</date>              <date type="whenModified">2020-06-12 10:43:26</date>              <date type="whenReleased">2020-06-04 17:57:51</date>              <date type="whenProduced">1982</date>              
<fs>                <f name="inra_publicVise_local" notation="string">                  <string>inra_publicVise_local_AU</string>                </f>              </fs>            </edition>            <respStmt>              <resp>contributor</resp>              <name key="888556">                <persName>                  <forename>Migration</forename>                  <surname>ProdInra</surname>                </persName>                <email type="md5">9e5ed7d56b52b3d4383123b1ffb42236</email>                <email type="domain">inrae.fr</email>              </name>            </respStmt>          </editionStmt>          <publicationStmt>            <distributor>CCSD</distributor>            <idno type="halId">hal-02780955</idno>            <idno type="halUri">https://hal.inrae.fr/hal-02780955</idno>            <idno type="halBibtex">lefebvre:hal-02780955</idno>            <idno type="halRefHtml">&lt;i&gt;Informatique et Biosphere, Xeme colloque&lt;/i&gt;, 1982, Paris</idno>            <idno type="halRef">Informatique et Biosphere, Xeme colloque, 1982, Paris</idno>          </publicationStmt>          <seriesStmt>            <idno type="stamp" n="INRAE">Institut National de Recherche en Agriculture, Alimentation et Environnement</idno>          </seriesStmt>          <notesStmt>            <note type="audience" n="2">International</note>            <note type="invited" n="1">Yes</note>            <note type="popular" n="0">No</note>            <note type="peer" n="0">No</note>          </notesStmt>          <sourceDesc>            <biblStruct>              <analytic>                <title xml:lang="fr">Programme sur micro ordinateur de calcul de parente et de consanguinite sur 16 generations.</title>                <author role="aut">                  <persName>                    <forename type="first">J.</forename>                    <surname>Lefebvre</surname>                  </persName>                  <idno type="halauthorid">159685-0</idno>             
   </author>              </analytic>              <monogr>                <meeting>                  <title>Informatique et Biosphere, Xeme colloque</title>                  <date type="start">1982</date>                  <date type="end">1982</date>                  <settlement>Paris</settlement>                  <country key="INC"/>                </meeting>                <imprint>                  <publisher>INA-PG</publisher>                  <pubPlace>Paris</pubPlace>                  <biblScope unit="serie">Informatique et Biosphere, Xeme colloque, Micro informatique et biosphere</biblScope>                  <date type="datePub">1982</date>                </imprint>              </monogr>              <idno type="prodinra">8214</idno>            </biblStruct>          </sourceDesc>          <profileDesc>            <langUsage>              <language ident="fr">French</language>            </langUsage>            <textClass>              <classCode scheme="halDomain" n="sdv">Life Sciences [q-bio]</classCode>              <classCode scheme="halTypology" n="COMM">Conference papers</classCode>              <classCode scheme="halOldTypology" n="COMM">Conference papers</classCode>              <classCode scheme="halTreeTypology" n="COMM">Conference papers</classCode>            </textClass>          </profileDesc>        </biblFull>      </listBibl>    </body>    <back>      <listOrg type="structures"/>    </back>  </text></TEI>',
                }
            ],
        }
    }
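
To check what actually gets sent, the URL can be built and inspected offline. The following is only a minimal sketch using the standard library: `build_query`, `build_url` and `pick_fields` are hypothetical helper names (not part of requests or the HAL API), and the sample document is trimmed by hand from the output above rather than fetched live. Note that the response above uses the key `docid`, not `doc_id`.

```python
from urllib.parse import parse_qs, urlencode, urlparse

API_URL = "https://api.archives-ouvertes.fr/search/"


def build_query(keyword, year, author):
    """Join the three conditions into one query string (hypothetical helper)."""
    return f"title_t:{keyword}~ AND producedDateY_i:{year} AND authFullName_t:{author}~"


def build_url(keyword, year, author, fields):
    """Return the request URL with a single, URL-encoded `q` parameter."""
    params = {
        "q": build_query(keyword, year, author),
        "wt": "json",
        "fl": ",".join(fields),
    }
    return f"{API_URL}?{urlencode(params)}"


def pick_fields(data, fields):
    """Keep only the requested keys of each document; absent keys become None."""
    return [{name: doc.get(name) for name in fields} for doc in data["response"]["docs"]]


url = build_url(
    "consanguinite", "1982", "Lefebvre",
    ["docid", "title_s", "authFullName_s", "producedDateY_i"],
)
print(url)

# Decode the query string again to confirm there is exactly one `q` value.
decoded = parse_qs(urlparse(url).query)
print(decoded["q"])

# Sample response trimmed by hand from the output shown above (assumption: same shape).
sample = {
    "response": {
        "numFound": 1,
        "docs": [
            {"docid": "2780955", "authFullName_s": ["J. Lefebvre"], "producedDateY_i": 1982}
        ],
    }
}
print(pick_fields(sample, ["docid", "authFullName_s", "doiId_s"]))
```

Using `doc.get(name)` means a field the record does not carry (here `doiId_s`) comes back as `None` instead of raising a `KeyError`, which makes missing DOIs easy to spot.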