Big Query parsing error while reading snappy compressed avro file

I have a requirement to load a Hadoop snappy-compressed Avro file into BigQuery. The Google docs say that BigQuery detects snappy compression, but when I tried

    bq load --source-format=AVRO project:dataset.table gs://mybucket/inputsnappy.snappy

I got the error "Apache Avro library failed to parse the header with the following error: Invalid data file. Magic does not match". Any input on this will really help.

The docs also say that only compression on the data blocks can be detected by BigQuery. Can someone help me understand that point about data blocks?
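For context, a minimal sketch (fastavro is assumed to be installed, and the path below is a hypothetical local copy of the GCS object) of how to check whether the file really is an Avro container and which block codec it uses. The "compression on data blocks" the docs mention is the avro.codec recorded in the container header, not compression of the file as a whole:

    # Minimal check: is the local copy of the GCS object a real Avro container,
    # and if so, which block-level codec does it use?
    # "inputsnappy.snappy" is assumed to be a local copy of the GCS object.
    from fastavro import reader

    AVRO_MAGIC = b"Obj\x01"   # the first four bytes of every Avro container file

    path = "inputsnappy.snappy"

    with open(path, "rb") as f:
        magic = f.read(4)

    if magic != AVRO_MAGIC:
        # A file compressed as a whole (for example by Hadoop's SnappyCodec)
        # does not start with the Avro magic, which is exactly what produces
        # "Magic does not match" in BigQuery.
        print("Not an Avro container file - BigQuery cannot load it as AVRO")
    else:
        with open(path, "rb") as f:
            avro_reader = reader(f)
            # avro.codec is "null", "deflate" or "snappy"; BigQuery detects
            # deflate/snappy because they apply to the data blocks only.
            print("Avro container, block codec:",
                  avro_reader.metadata.get("avro.codec", "null"))

If this prints "snappy" or "deflate", the same file should load directly with bq load --source-format=AVRO; if the magic check fails, the object is not an Avro container at all.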
I also tried converting the snappy file back to Avro using python-snappy, but I get an error when doing

    decompressed_data = snappy.decompress(input_data)

Error: "Uncompress: invalid input file". Not sure how to proceed now.
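On the python-snappy side: snappy.decompress expects the raw snappy format, whereas files written by Hadoop's SnappyCodec use a framed, length-prefixed block layout, which is most likely why it reports invalid input. One possible way forward, sketched below purely as an illustration (the schema, records and output path are made-up placeholders, and it assumes the records can be re-read from the original source), is to write the data into a fresh Avro container whose data blocks are compressed with a codec BigQuery detects:

    # Illustration only: write records into an Avro container whose data blocks
    # are compressed. Schema, records and output path are hypothetical.
    from fastavro import writer, parse_schema

    schema = parse_schema({
        "type": "record",
        "name": "Example",
        "fields": [
            {"name": "id", "type": "long"},
            {"name": "name", "type": "string"},
        ],
    })

    records = [
        {"id": 1, "name": "alpha"},
        {"id": 2, "name": "beta"},
    ]

    with open("output.avro", "wb") as out:
        # codec="snappy" would need a working python-snappy install;
        # codec="deflate" only needs the standard library. Both compress the
        # data blocks, not the file as a whole, so BigQuery can read them.
        writer(out, schema, records, codec="deflate")

The resulting output.avro can then be copied to GCS and loaded with the same bq load --source-format=AVRO command.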
Asked by Sruthi Chandran · 81 views · 0 answers