How to maintain an ongoing archive? #154

@allefeld

This question is related to #147, which received a long answer, but after reading it I'm unfortunately still not sure...

My use for your tool is not a "goodbye!" archive: I'm still using Twitter and will probably continue to do so for a while. I want an archive of my tweets because older tweets are hard to access through the standard interface, tweets I retweeted or quote tweeted are sometimes deleted by their poster, and maybe someday Twitter will just disappear suddenly without even the ability to request a last archive. (Who knows? Just recently they announced they'll monetize API access.)

Because of my ongoing activity, I'd like to maintain an ongoing archive, i.e. periodically add new tweets but keep those that are already archived (even if they were deleted in the meantime). Less important, but still relevant: there's no need to download all the old stuff again if it has already been archived, and I'd prefer to avoid getting my IP address blacklisted by Twitter due to excessive requests.

Since the script is processing archives from the official Twitter archive function, there is no way around periodically requesting and downloading new archives. My idea would be that I can unzip a new archive into the old folder, run parser.py again, and then only new stuff is added to parser-output.

My impression from the issue cited above is that at the moment, twitter-archive-parser is not made for that. Things may be getting there, but currently there is no guarantee. That would mean that for the moment I should unzip each new archive into its own folder, and run parser.py on that – and maybe later figure out a way to merge a series of such folders.

Is that correct?

And are there plans to support what I have in mind at some point?
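
To make the "merge a series of folders" idea concrete, here is a minimal sketch of what I imagine, not something twitter-archive-parser provides. The function names `load_tweets` and `merge_tweets` are hypothetical, and I'm assuming that each archive's tweets file is a JavaScript assignment wrapping a JSON array and that every tweet carries a unique `id_str` field:

```python
import json
from pathlib import Path


def load_tweets(js_path):
    """Read one archive's tweets file and return a list of tweet dicts.

    Assumption: the file is a JavaScript assignment whose right-hand side
    is a JSON array, e.g. `window.YTD.tweets.part0 = [...]`.
    """
    text = Path(js_path).read_text(encoding="utf-8")
    payload = text[text.index("["):]  # drop everything before the JSON array
    # Assumption: each entry is either {"tweet": {...}} or the tweet itself.
    return [entry.get("tweet", entry) for entry in json.loads(payload)]


def merge_tweets(snapshots):
    """Merge tweet lists (oldest snapshot first), keyed by the assumed "id_str".

    A tweet missing from a later snapshot (e.g. because it was deleted) is
    kept from the earlier one; a tweet present in several snapshots is taken
    from the latest one.
    """
    merged = {}
    for tweets in snapshots:
        for tweet in tweets:
            merged[tweet["id_str"]] = tweet
    return sorted(merged.values(), key=lambda t: int(t["id_str"]))


if __name__ == "__main__":
    # Toy data standing in for two archive downloads; tweet "1" was deleted
    # before the second archive was requested.
    old = [{"id_str": "1", "full_text": "first"},
           {"id_str": "2", "full_text": "second"}]
    new = [{"id_str": "2", "full_text": "second"},
           {"id_str": "3", "full_text": "third"}]
    print([t["id_str"] for t in merge_tweets([old, new])])
```

With real archives, the lists would come from calling `load_tweets` on the tweets file in each unzipped folder, oldest folder first. Media files and the generated parser-output would need their own handling, which this sketch ignores.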

Btw, thanks for your already brilliant tool!
