python slack slack-api

Pulling historical channel messages with Python


I am trying to build a small dataset by pulling messages and replies from a Slack channel I belong to. I would like to use Python to pull the data, but I am having trouble finding my API key. I have created an app on Slack and can see its client secret, signing secret, and verification token, but I can't find an API key.

Here is a basic example of what I believe I am trying to accomplish:

import slack
sc = slack.SlackClient("api key")
sc.api_call(
  "channels.history",
  channel="C0XXXXXX"
)

I am willing to just download the data manually if that is possible as well. Any help is greatly appreciated.


Solution

  • Messages

    See below for example code that pulls messages from a channel in Python.

    Threads

    Note that the conversations.history endpoint does not return thread replies. Those have to be retrieved separately, with one call to conversations.replies for every thread whose messages you want.

    Threads can be identified by checking each channel message for the thread_ts property. If it exists, there is a thread attached to that message. See this page for more details on how threads work.
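
The thread retrieval described above could be sketched roughly as follows. This is a minimal sketch under assumptions, not part of the original script: `fetch_thread_replies` and its `pause` parameter are hypothetical names, `client` is assumed to be a `slack.WebClient` like the one in the example code, and pagination of long threads is ignored for brevity.

```python
from time import sleep


def fetch_thread_replies(client, channel, messages, pause=1.0):
    """Return a dict mapping each thread's parent ts to a list of its replies.

    Hypothetical helper: `client` is assumed to expose conversations_replies()
    the way slack.WebClient does. Only the first page of each thread is read.
    """
    threads = {}
    for message in messages:
        thread_ts = message.get("thread_ts")
        # A thread parent carries a thread_ts equal to its own ts.
        if thread_ts is None or thread_ts != message.get("ts"):
            continue
        response = client.conversations_replies(channel=channel, ts=thread_ts)
        # The response also contains the parent message, so filter it out.
        threads[thread_ts] = [
            reply for reply in response["messages"] if reply.get("ts") != thread_ts
        ]
        sleep(pause)  # stay below the rate limit between calls
    return threads
```

With the example code's variables, this might be called as `fetch_thread_replies(client, CHANNEL, messages_all)` after all pages have been collected.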

    IDs

    This script will not replace IDs with names, though. If you need that, the IDs have to be resolved with additional API calls, for example users.list for users and conversations.list for channels.
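
One way to resolve user IDs could look like the sketch below. It assumes a `client` like the `slack.WebClient` in the example code, reads only the first page of users.list, and stores the resolved name under a hypothetical `user_name` key. The helper names are made up for illustration. Channel IDs could be handled the same way with conversations.list.

```python
import re

# Matches user mentions such as <@U012AB3CD> or the older <@U012AB3CD|name> form.
MENTION = re.compile(r"<@([UW][A-Z0-9]+)(?:\|[^>]*)?>")


def build_user_map(client):
    """Map user IDs to user names via users.list (first page only, an assumption)."""
    response = client.users_list()
    return {member["id"]: member["name"] for member in response["members"]}


def replace_user_ids(messages, user_map):
    """Return copies of the messages with user IDs resolved where known."""
    resolved = []
    for message in messages:
        message = dict(message)  # shallow copy, the input stays untouched
        if "user" in message:
            # "user_name" is a hypothetical key added for illustration.
            message["user_name"] = user_map.get(message["user"], message["user"])
        if "text" in message:
            # Unknown IDs are left as they are, just without the angle brackets.
            message["text"] = MENTION.sub(
                lambda m: "@" + user_map.get(m.group(1), m.group(1)),
                message["text"],
            )
        resolved.append(message)
    return resolved
```

A possible call would be `replace_user_ids(messages_all, build_user_map(client))` right before the result is written to the file.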

    Example code

    import json
    import os
    from time import sleep

    import slack  # provided by the slackclient v2 package (pip install slackclient)
    
    CHANNEL = "C12345678"
    MESSAGES_PER_PAGE = 200
    MAX_MESSAGES = 1000
    
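    # init web client; SLACK_TOKEN must hold the app's OAuth token (xoxb-... or xoxp-...),
    # shown on the app's "OAuth & Permissions" page once the app is installed
    client = slack.WebClient(token=os.environ['SLACK_TOKEN'])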
    
    # get first page
    page = 1
    print("Retrieving page {}".format(page))
    response = client.conversations_history(
        channel=CHANNEL,
        limit=MESSAGES_PER_PAGE,
    )
    assert response["ok"]
    messages_all = response['messages']
    
    # get additional pages while below the message cap and while there are more
    while len(messages_all) + MESSAGES_PER_PAGE <= MAX_MESSAGES and response['has_more']:
        page += 1
        print("Retrieving page {}".format(page))
        sleep(1)   # need to wait 1 sec before next call due to rate limits
        response = client.conversations_history(
            channel=CHANNEL,
            limit=MESSAGES_PER_PAGE,
            cursor=response['response_metadata']['next_cursor']
        )
        assert response["ok"]
        messages = response['messages']
        messages_all = messages_all + messages
    
    print(
        "Fetched a total of {} messages from channel {}".format(
            len(messages_all),
            CHANNEL
        )
    )

    # write the result to a file
    with open('messages.json', 'w', encoding='utf-8') as f:
        json.dump(
            messages_all,
            f,
            sort_keys=True,
            indent=4,
            ensure_ascii=False
        )