python pyrogram

pyrogram: file names of documents or video files from a topic of a supergroup are not printed


I am trying to extract file names from a topic of a supergroup. This is what I tried:

from pyrogram import Client

app = Client(
    name="@Peter_LongX",
    api_id=27*******,
    api_hash="b5*******************",
    phone_number="+393*******",
    password="********" or None
)

group_id = -1001867911973
topic_id = 692
msg_file_dict = {}

with app:
    for message in app.get_chat_history(group_id, limit=5):
        print(f"Message Link: {message.link}")
        print(f"Message ID: {message.id}")
        
        if message.link and str(topic_id) in message.link:
            print("Topic ID found in message link")
            
            if message.video or message.document.mime_type.startswith("video"):
                print("Video or video document found")
                
                msg_id = message.id
                file_name = message.video.file_name or f"VID_{message.id}_{message.video.file_unique_id}.{message.video.mime_type.split('/')[-1]}"
                msg_file_dict[msg_id] = file_name

print(msg_file_dict.keys()) # List of Message ID
print(msg_file_dict.values()) # List of File Name

ERRORS

  1. It shows me the latest message links / message IDs of the entire supergroup, not those of the specific topic. I want to extract file names and message links/IDs from topic 692; I took that ID from the URL I see in Telegram Web: https://web.telegram.org/a/#-1867911973_692. However, it doesn't print the names of the documents, even though they belong to the supergroup.

  2. The topic contains documents, rar files and videos, but the file names are not printed for the messages that contain documents.

Any idea how to solve this?

In the console I see something like this

PS C:\Users\Peter\Desktop\script\messagge_id_telegram> python getmsg.py
Message Link: https://t.me/lasoff...../223390
Message ID: 223390
Message Link: https://t.me/lasoff...../223389
Message ID: 223389

but I expect something like this

PS C:\Users\Peter\Desktop\script\messagge_id_telegram> python getmsg.py
Message Link: https://t.me/lasoff...../223390
Message ID: 223390

Filename: Fondazione_1x10.mp4
Message Link: https://t.me/lasoff...../223389
Message ID: 223389

For example, the message at https://t.me/lasoff...../223390 might not have a file name, because it may be a sticker or an emoji.

Additional question: do you have any idea how I could print the results for a specific date range, for example from July 1st to August 1st 2023?

EDIT: as quamrana suggested, I tried to get the file names printed, but nothing changes: they are still not returned, even when files are present. To do this I changed this part of the code

    if message.link and str(topic_id) in message.link:
        print("Topic ID found in message link")

with this

if message.link and str(topic_id) in message.link and (message.video or message.document.mime_type.startswith("video")):
    print("Topic ID found in message link")
    print("Video or video document found")
    msg_id = message.id
    file_name = message.video.file_name or f"VID_{message.id}_{message.video.file_unique_id}.{message.video.mime_type.split('/')[-1]}"
    msg_file_dict[msg_id] = file_name

Solution

  • The message links returned by get_chat_history() do not contain the topic ID; they follow a format like https://t.me/lasoff...../223390. These links are built from the chat and the message ID only.

    Solution to error 1:

    To get the messages of a specific topic, use get_discussion_replies(). It takes the ID of the topic's starter message and returns only the messages in that topic thread.

    Solution to error 2:

    Your code has an if condition that is never satisfied:

    if message.link and str(topic_id) in message.link:
    

    The str(topic_id) in message.link part is never True, because the links do not contain the topic ID the way the Telegram Web URL does.

    The corrected code looks like this:

    # `app` is the Client instance created in the question's code.
    group_id = -1001867911973
    topic_id = 692  # ID of the topic's starter message
    msg_file_dict = {}
    
    async def main():
        async with app:
            async for message in app.get_discussion_replies(chat_id=group_id,  message_id=topic_id):
                print(f"Message ID: {message.id}")
    
                if message.video or (message.document and message.document.mime_type.startswith("video")):
                    file = message.video or message.document
                    print("Video or video document found")
    
                    msg_id = message.id
                    file_name = file.file_name or f"VID_{message.id}_{file.file_unique_id}.{file.mime_type.split('/')[-1]}"
                    print(file_name)
                    msg_file_dict[msg_id] = file_name
                print()
    
    app.run(main())
    print(msg_file_dict.keys())  # List of Message ID
    print(msg_file_dict.values())  # List of File Name
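
The question also mentions rar files and other documents, and asks about limiting results to a date range. Neither is covered by the code above, so here is a minimal sketch of two helpers that could be called inside the loop. It assumes Pyrogram 2.x, where message.date is a datetime object. The names media_file_name and in_date_range are hypothetical helpers, not part of Pyrogram, and the SimpleNamespace objects are stand-ins for real messages so the sketch runs without a Telegram session.

```python
from datetime import datetime
from types import SimpleNamespace


def media_file_name(message):
    """Return a file name for a video or any document (rar, pdf, ...), else None."""
    media = getattr(message, "video", None) or getattr(message, "document", None)
    if media is None:
        return None  # e.g. a sticker, an emoji or a plain text message
    if getattr(media, "file_name", None):
        return media.file_name
    # Fallback name, built the same way as in the code above.
    mime = getattr(media, "mime_type", None) or "application/octet-stream"
    return f"FILE_{message.id}_{media.file_unique_id}.{mime.split('/')[-1]}"


def in_date_range(message, start, end):
    """True if start <= message.date < end (assumes message.date is a datetime)."""
    return message.date is not None and start <= message.date < end


# July 1st to August 1st 2023; the end bound is exclusive, so it is set to
# August 2nd in order to include all of August 1st.
START = datetime(2023, 7, 1)
END = datetime(2023, 8, 2)

# Stand-in messages (hypothetical data, shaped like the attributes used above).
video_msg = SimpleNamespace(
    id=223389,
    date=datetime(2023, 7, 15, 12, 0),
    video=None,
    document=SimpleNamespace(
        file_name="Fondazione_1x10.mp4", mime_type="video/mp4", file_unique_id="abc"
    ),
)
sticker_msg = SimpleNamespace(
    id=223390, date=datetime(2023, 7, 16, 9, 30), video=None, document=None
)
old_rar_msg = SimpleNamespace(
    id=100,
    date=datetime(2023, 5, 1, 8, 0),
    video=None,
    document=SimpleNamespace(
        file_name=None, mime_type="application/x-rar", file_unique_id="xyz"
    ),
)

msg_file_dict = {}
for message in (video_msg, sticker_msg, old_rar_msg):
    name = media_file_name(message)
    if name is not None and in_date_range(message, START, END):
        msg_file_dict[message.id] = name

print(msg_file_dict)
```

In the real script the same two checks would go inside the async for loop over app.get_discussion_replies(...), replacing the video-only condition.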