[Azure Blob Storage] Gunzip data that has magic header #526

Open
wants to merge 1 commit into
base: master
from

shigeya-dd

What does this PR do?

Gunzips the blobContent data read from Azure Blob Storage when it starts with the gzip magic header 0x1f 0x8b.
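
As an illustration only, the check described above can be sketched in Python, the language of the snippet quoted under Additional Notes. This is not the code in this PR's commit: the names `GZIP_MAGIC` and `gunzip_if_magic` are hypothetical, and returning the original bytes when decompression fails is an assumption of the sketch, not something the PR states.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip stream


def gunzip_if_magic(data: bytes) -> bytes:
    """Return decompressed bytes if data starts with the gzip magic header.

    Data without the header is returned unchanged.
    """
    if data[:2] != GZIP_MAGIC:
        return data
    try:
        return gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        # Assumption of this sketch: if the header matched by coincidence or
        # the stream is damaged, fall back to the original bytes.
        return data


if __name__ == "__main__":
    payload = b"line one\nline two\n"
    assert gunzip_if_magic(gzip.compress(payload)) == payload
    assert gunzip_if_magic(payload) == payload
```

Checking only the first two bytes keeps the test cheap, but any binary blob that happens to start with 0x1f 0x8b would also be sent to the decompressor, which is why the sketch guards the call.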

Motivation

Currently, gzip files in Blob Storage are forwarded to Datadog as-is, which results in garbled logs.
This PR gunzips those files before sending them.

Testing Guidelines

Upload gzipped files to Blob Storage and confirm in Datadog Logs that they arrive gunzipped rather than garbled.
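
One possible way to produce a gzipped test file for the manual check above is sketched below. The helper name `write_gzipped_log`, the file name, and the sample lines are hypothetical; uploading the file to Blob Storage and checking Datadog Logs remain manual steps.

```python
import gzip
import tempfile
from pathlib import Path


def write_gzipped_log(path: Path, lines: list[str]) -> bytes:
    """Write lines to a gzip file at path and return the uncompressed bytes."""
    raw = "".join(f"{line}\n" for line in lines).encode("utf-8")
    with gzip.open(path, "wb") as fh:
        fh.write(raw)
    return raw


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.log.gz"  # hypothetical file name
        expected = write_gzipped_log(sample, ["hello", "world"])
        blob = sample.read_bytes()
        # The file on disk starts with the gzip magic header.
        assert blob[:2] == b"\x1f\x8b"
        # Decompressing it gives back the original log lines.
        assert gzip.decompress(blob) == expected
```

The log lines expected in Datadog are the uncompressed contents returned by the helper.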

Additional Notes

The magic header idea is taken from the following snippet:

# Decompress data that has a .gz extension or magic header http://www.onicos.com/staff/iz/formats/gzip.html
if key[-3:] == ".gz" or data[:2] == b"\x1f\x8b":
    with gzip.GzipFile(fileobj=BytesIO(data)) as decompress_stream:
        # Reading line by line avoid a bug where gzip would take a very long time (>5min) for
        # file around 60MB gzipped
        data = b"".join(BufferedReader(decompress_stream))

Types of changes

  • Bug fix
  • New feature
  • Breaking change
  • Misc (docs, refactoring, dependency upgrade, etc.)

Check all that apply

  • This PR's description is comprehensive
  • This PR contains breaking changes that are documented in the description
  • This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
  • This PR impacts documentation, and it has been updated (or a ticket has been logged)
  • This PR's changes are covered by the automated tests
  • This PR collects user input/sensitive content into Datadog
  • This PR passes the integration tests (ask a Datadog member to run the tests)
  • This PR passes the unit tests
  • This PR passes the installation tests (ask a Datadog member to run the tests)
