What is FEATHER?
Feather is a portable file format built on Apache Arrow's columnar memory specification, designed for fast reading and writing of data frames between Python (pandas, Polars) and R. It writes Arrow's in-memory columnar layout directly to disk, which allows memory-mapped, zero-copy reads of uncompressed files. Feather V2, released in 2020 with Apache Arrow 0.17.0, is the Arrow IPC file format, which broadens language support and adds optional compression. Reads and writes are typically much faster than CSV because no text parsing is needed; the often-quoted figure of 10-100x depends on the data and the hardware.
Feather is used for temporary data storage in data science workflows, sharing datasets between Python and R, caching intermediate results, and fast data frame serialization. It is popular with data scientists using pandas or Polars for quick saves during exploratory analysis, and it is common in Jupyter notebooks for checkpointing work. Feather V1 was explicitly not intended for long-term storage. V2 uses the stable Arrow IPC file format, but Parquet is still usually the better choice for archival because it compresses better and is more widely supported. Feather's strengths are fast iteration and language interoperability.
History
Wes McKinney (creator of pandas) and Hadley Wickham (creator of the tidyverse) collaborated to create Feather for seamless data exchange between Python and R.
Key Milestones
- 2016: Feather V1 announced
- 2017: pandas gains built-in Feather read/write support (pandas 0.20)
- 2020: Feather V2, based on Arrow IPC, ships with Apache Arrow 0.17.0
- 2021: Polars adoption
- 2023: Widespread data science use
- Present: Standard for fast I/O
Key Features
Core Capabilities
- Fast I/O: Often cited as 10-100x faster than CSV
- Zero-Copy: Memory-mapped reads of uncompressed files
- Columnar: Efficient column-wise storage
- Python/R: Native support
- Arrow-Based: Standard memory format
- Compression: Optional LZ4/Zstd (V2 only)
Common Use Cases
- Pandas: Fast DataFrame saves
- R Integration: Python-R data exchange
- Caching: Intermediate results
- Notebooks: Jupyter checkpoints
Advantages
- Extremely fast read/write
- Zero-copy memory mapping
- Native support in both Python and R
- Arrow ecosystem integration
- Simple API
- Lightweight format
- Preserves column data types such as dates and categoricals (V2 supports all Arrow types)
Disadvantages
- Rarely the best choice for long-term archival (V1 was never intended for it)
- Limited compression vs Parquet
- Binary format (not human-readable)
- Less ecosystem support than Parquet
- No schema evolution
- Designed primarily for short-term storage and data exchange
Technical Information
Format Specifications
| Specification | Details |
|---|---|
| File Extension | .feather, .arrow |
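| MIME Type | application/vnd.apache.arrow.file (V2); application/octet-stream as a generic fallback |
| Version | V2 (Arrow IPC file format); V1 is legacy |
| Storage | Columnar |
| Compression | LZ4, Zstd (optional, V2 only) |
| Base Format | Apache Arrow columnar memory format |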
Common Tools
- Python: pandas, Polars, PyArrow
- R: arrow package, feather package (legacy, V1)
- Julia: Feather.jl (V1), Arrow.jl
- Processing: Apache Spark (uses Arrow in memory for pandas conversion rather than reading Feather files directly)