Go: append a file to a TAR archive

Aug 10, 2021·Vladimir Markelov
golang, archive, tar

The problem

AIStore supports a whole gamut of “archival” operations that let you read, write, and list archives such as .tar, .tgz, and .zip. When we started working on appending content to existing archives, we quickly discovered that, surprisingly, open-source code for this appears to be missing. Standard Go packages (e.g., archive/tar) fully support creating and reading archives but not appending to an existing one…

Searching the Internet did not help: the open-source snippets we could find either did not work or worked only under certain restricted conditions.

In this text, we show how to append a file to an existing TAR. GitHub references are included below.

First attempts

The first idea was to open an archive for appending and write new data at the end. It did not work: the new file did not show up in the archive listing, and its content was inaccessible. The TAR specification states:

A tar archive consists of a series of 512-byte records. Each file system object requires a header record which stores basic metadata (pathname, owner, permissions, etc.) and zero or more records containing any file data. The end of the archive is indicated by two records consisting entirely of zero bytes.
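
The layout described in the quote can be checked with the standard library alone. The following is a minimal sketch, not code from AIStore: buildTar and endsWithTrailer are hypothetical helper names introduced here. It writes a one-file archive into memory with archive/tar and verifies that the result is a whole number of 512-byte records ending in two all-zero records.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"log"
)

const recordSize = 512

// buildTar returns an in-memory TAR archive holding a single file.
// (Hypothetical helper, used only for this illustration.)
func buildTar(name string, data []byte) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{Name: name, Mode: 0o644, Size: int64(len(data))}
	if err := tw.WriteHeader(hdr); err != nil {
		return nil, err
	}
	if _, err := tw.Write(data); err != nil {
		return nil, err
	}
	// Close pads the last entry to a record boundary and writes the trailer.
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// endsWithTrailer reports whether the last two records of b are all zeros.
func endsWithTrailer(b []byte) bool {
	if len(b) < 2*recordSize {
		return false
	}
	for _, c := range b[len(b)-2*recordSize:] {
		if c != 0 {
			return false
		}
	}
	return true
}

func main() {
	archive, err := buildTar("hello.txt", []byte("hello"))
	if err != nil {
		log.Fatalln(err)
	}
	fmt.Println("archive size in bytes:", len(archive))
	fmt.Println("whole number of records:", len(archive)%recordSize == 0)
	fmt.Println("ends with two zero records:", endsWithTrailer(archive))
}
```

Note that the check only confirms that at least two zero records are present. It says nothing about how many more zero records a particular tool may have written.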

Every TAR archive ends with an end-of-archive marker (a trailer): two zero blocks. Any information written after the trailer is ignored. This made it clear that the header and data of a new file had to overwrite the trailing zero blocks. As the trailer size was 2 records, it seemed sufficient to start writing the new data at a 1 KiB offset from the end of the archive. A solution found on the Internet employed this idea:

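package main

import (
	"archive/tar"
	"io"
	"log"
	"os"
)

const recordSize = 512

func main() {
	var data []byte // content of the file to append
	f, err := os.OpenFile("test.tar", os.O_RDWR, os.ModePerm)
	if err != nil {
		log.Fatalln(err)
	}
	// Step back over the trailer, assuming it is exactly two records (1 KiB).
	if _, err = f.Seek(-2*recordSize, io.SeekEnd); err != nil {
		log.Fatalln(err)
	}
	tw := tar.NewWriter(f)
	hdr := &tar.Header{
		Name: "new_file",
		Size: int64(len(data)),
	}
	if err := tw.WriteHeader(hdr); err != nil {
		log.Fatalln(err)
	}
	if _, err := tw.Write(data); err != nil {
		log.Fatalln(err)
	}
	// Close pads the new entry and writes a fresh trailer.
	if err := tw.Close(); err != nil {
		log.Fatalln(err)
	}
	if err := f.Close(); err != nil {
		log.Fatalln(err)
	}
}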

But the story did not end here. It worked fine only with TAR archives created with the Go standard library. When I tried to append a new file to an archive created with a system tar utility, it failed: the appended file was missing again.

The solution

Digging into the problem, I discovered that the number of zero blocks in the archive trailer depends on the TAR version and its defaults. The Go package adds only 1 KiB of zeros, but the archive created with the system tar had more than 4 KiB of zeros at the end. That was why the first approach did not work with an arbitrary TAR archive. TAR does not seem to store information about the trailer anywhere, so I had to calculate the size of the trailer somehow. My final solution is inefficient for archives with a lot of files, yet it is reliable and works with any TAR archive:

  1. Open an archive.
  2. Pass its file handle to a TAR reader.
  3. Iterate through all files inside the archive until io.EOF is reached.
  4. For each file, the TAR reader reports the file size, and the file pointer gives the position at which the file's data starts.
  5. When the TAR reader returns io.EOF, the file pointer has already moved past the start of the zero trailer. So we have to use the numbers from the previous iteration to calculate the end of the archive data.

One subtlety: the next archive entry must be written at a position aligned to the TAR record boundary (512 bytes). So the file size must be rounded up to the nearest multiple of the TAR record size.
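
The rounding itself is easy to get subtly wrong, so here is a small sketch of one way to do it. The helper name roundUp is an assumption made for this illustration. The formula adds recordSize-1 before dividing, which keeps a zero-byte file at zero. A common alternative, ((size-1)/recordSize + 1) * recordSize, returns 512 for a size of 0 in Go, because integer division truncates toward zero.

```go
package main

import "fmt"

const recordSize = 512

// roundUp returns size rounded up to the nearest multiple of recordSize.
// A size of zero stays zero, so an empty file occupies no data records.
// (Hypothetical helper, used only for this illustration.)
func roundUp(size int64) int64 {
	return (size + recordSize - 1) / recordSize * recordSize
}

func main() {
	for _, size := range []int64{0, 1, 511, 512, 513, 1025} {
		fmt.Printf("%4d -> %4d\n", size, roundUp(size))
	}
	// Prints:
	//    0 ->    0
	//    1 ->  512
	//  511 ->  512
	//  512 ->  512
	//  513 -> 1024
	// 1025 -> 1536
}
```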

package main

import (
	"archive/tar"
	"io"
	"log"
	"os"
)

const recordSize = 512

func main() {
	var data []byte // content of the file to append
	fh, err := os.OpenFile("test.tar", os.O_RDWR, os.ModePerm)
	if err != nil {
		log.Fatalln(err)
	}
	var lastPos, lastSize int64
	twr := tar.NewReader(fh)
	for {
		st, err := twr.Next()
		if err != nil {
			if err == io.EOF {
				break
			}
			log.Fatalln(err)
		}
		// After Next, the file offset points at the start of this entry's data.
		if lastPos, err = fh.Seek(0, io.SeekCurrent); err != nil {
			log.Fatalln(err)
		}
		lastSize = st.Size
	}
	// Round up the size of the last file to a multiple of recordSize.
	// Adding recordSize-1 first keeps a zero-size entry at zero.
	paddedSize := (lastSize + recordSize - 1) / recordSize * recordSize
	if _, err = fh.Seek(lastPos+paddedSize, io.SeekStart); err != nil {
		log.Fatalln(err)
	}

	tw := tar.NewWriter(fh)
	hdr := &tar.Header{
		Name: "new_file",
		Size: int64(len(data)),
	}
	if err = tw.WriteHeader(hdr); err != nil {
		log.Fatalln(err)
	}
	if _, err = tw.Write(data); err != nil {
		log.Fatalln(err)
	}
	// Close pads the new entry and writes a fresh trailer.
	if err = tw.Close(); err != nil {
		log.Fatalln(err)
	}
	if err = fh.Close(); err != nil {
		log.Fatalln(err)
	}
}

References

For the latest code, please see:

  • The function OpenTarForAppend in the “cos” package.
  • An example of how to use OpenTarForAppend: the implementation of the function appendToArch in the core package.