Feature Request: add BDfR as a new extractor for archiving Reddit content #778

### Discussed in https://github.com/ArchiveBox/ArchiveBox/discussions/754

<div type='discussions-op-text'>

<sup>Originally posted by **BlipRanger** May 24, 2021</sup>
Just wanted to make a quick mention of [BDfR](https://github.com/aliparlakci/bulk-downloader-for-reddit) as a cool project that might make a good starting point for the unrolling of Reddit comments/posts mentioned in the roadmap. It currently supports grabbing a variety of media types from a post, as well as saving the comments/text to a separate JSON file. I've been working on an [addon](https://github.com/BlipRanger/bdfr-html) for it lately, and I think it's a great project with well-maintained code. If nothing else, it has really good examples of working with Reddit data, which could be useful! Just wanted to bring that to your attention!</div>

I'd love to add [BDfR](https://github.com/aliparlakci/bulk-downloader-for-reddit) as an extractor for Reddit content (and something similar for Twitter, see https://github.com/ArchiveBox/ArchiveBox/issues/345), but I'm somewhat swamped with work and travel for the near future.

If you, @BlipRanger, or anyone else wants to add it as an extractor, I'd be happy to review PRs! It should match the style of our other extractors; [`archivebox/extractors/media.py`](https://github.com/ArchiveBox/ArchiveBox/blob/dev/archivebox/extractors/media.py) is a great example to copy.

We have some good instructions for contributing a new extractor and getting started with ArchiveBox development in general:
- https://github.com/ArchiveBox/ArchiveBox/blob/dev/README.md#contributing-a-new-extractor
- https://github.com/ArchiveBox/ArchiveBox/blob/dev/README.md#archivebox-development
- https://github.com/ArchiveBox/ArchiveBox/blob/dev/.github/CONTRIBUTING.md
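
To give a rough idea of the shape such an extractor could take, here is a minimal standalone sketch that shells out to BDfR for Reddit URLs. Everything in it is an assumption for illustration: the helper names (`is_reddit_url`, `build_bdfr_cmd`, `save_reddit`) are hypothetical, the plain dict stands in for ArchiveBox's real result type, and the command line assumes BDfR can be run as `python -m bdfr archive <directory> --link <url>`, which should be checked against the BDfR README before relying on it. A real PR would follow `media.py` rather than this sketch.

```python
import subprocess
import sys
from pathlib import Path
from typing import List
from urllib.parse import urlparse

# Hypothetical list of hosts treated as Reddit content.
REDDIT_HOSTS = ("reddit.com", "redd.it")


def is_reddit_url(url: str) -> bool:
    """Return True if the URL's hostname is a Reddit domain or subdomain."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in REDDIT_HOSTS)


def build_bdfr_cmd(url: str, out_dir: Path) -> List[str]:
    """Build the BDfR command line.

    ASSUMPTION: BDfR is installed and accepts
    `python -m bdfr archive <directory> --link <url>`.
    """
    return [sys.executable, "-m", "bdfr", "archive", str(out_dir), "--link", url]


def save_reddit(url: str, out_dir: str, timeout: int = 60) -> dict:
    """Run BDfR for a single URL and report the outcome as a plain dict.

    The dict is a stand-in for ArchiveBox's own result object.
    """
    if not is_reddit_url(url):
        return {"status": "skipped", "cmd": None, "output": None}

    output_dir = Path(out_dir) / "reddit"
    output_dir.mkdir(parents=True, exist_ok=True)
    cmd = build_bdfr_cmd(url, output_dir)

    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        status = "succeeded" if result.returncode == 0 else "failed"
    except (subprocess.TimeoutExpired, OSError):
        # Timeout, or the interpreter/module could not be launched.
        status = "failed"

    return {"status": status, "cmd": cmd, "output": str(output_dir)}
```

Non-Reddit URLs are skipped before any directory is created or any process is spawned, so the function is cheap to call on every link in an archive.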
