LatencyKills 6 days ago [-]
Ex-Apple engineer here. This is, for better or worse, just the way Apple approaches this type of problem. From Apple's perspective, this is the way to preserve Finder / Gatekeeper / metadata semantics. It avoids silent data loss when round-tripping archives between Macs. This behavior also maintains consistency with copyfile(3) (as well as the Archive Utility behavior).
Apple treats tar less like “portable Unix interchange” and more like “archive this filesystem object faithfully.” That is very Apple, and very libarchive. ;-)
This is probably going to get worse (as Apple continues to add macOS-specific metadata), so your workaround is very helpful.
I haven't tested it in a while, but at one point, setting the COPYFILE_DISABLE=1 env variable would disable the inclusion of macOS-specific metadata.
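For anyone who wants to try that, here is a minimal sketch, assuming a `dist` directory like the one mentioned elsewhere in the thread. The helper name `make_portable_archive` is made up for illustration. On non-Apple tar implementations the variable should simply be unused; whether it also suppresses the newer pax xattr headers on current macOS is not something I can confirm.

```shell
#!/bin/sh
# Hypothetical helper: archive a directory while asking Apple's tar
# (through copyfile(3)) to skip AppleDouble "._*" metadata entries.
# On other tar implementations the variable is expected to be unused.
make_portable_archive() {
    src_dir=$1
    out_file=$2
    COPYFILE_DISABLE=1 tar -czf "$out_file" "$src_dir"
}

# Demo in a scratch directory so nothing real is touched.
work=$(mktemp -d)
mkdir -p "$work/dist"
echo "hello" > "$work/dist/index.html"
(cd "$work" && make_portable_archive dist dist.tar.gz)

# List the archive members to see what actually went in.
tar -tzf "$work/dist.tar.gz"
```

Listing the archive afterwards is the quickest way to check for stray `._` entries before shipping it to a Linux box.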
Terretta 3 days ago [-]
Arguably, principle of least surprise is very Apple.
If I point "tape archive" at a file system, I want that file system archived to tape. And so, tar does.
If I don't, well, that's a fine option, and there's a fine option for that.
So it's less of a "workaround" or something that "gets worse", than, "No, I don't really want a tape archive of this filesystem, only of some of it." And that's supported.
That said, never seeing another .DS_Store should be a system-wide option!
JoshTriplett 2 days ago [-]
> Arguably, principle of least surprise is very Apple.
Principle of least surprise is good engineering practice. The question is always whose surprise. Someone who expects tar to behave like other UNIX systems is going to be surprised by this. Someone who expects tar on Apple to have perfect fidelity would be surprised by not-this.
I increasingly feel like build systems should never be relying on any "native" utilities from the host system, and should instead be bringing them in via dependencies. You can't have this problem if your packaging system pulls in a specific portable `tar` library.
adrian_b 2 days ago [-]
What should be really surprising for the users of UNIX-like operating systems is when they lose data because traditional UNIX utilities like cp, tar or cpio do not make complete copies of files, as one would expect from their description.
What is worse is that these utilities do not give any warnings when they do not make complete copies. For cp, the root cause is that it has bad default options, while for tar and cpio the standard file formats cannot store the metadata of modern file systems.
The various tar programs have their own different file format extensions to deal with modern file systems, which are guaranteed to work only when using the same tar program for both creation and extraction. The better tar programs implement both their own file format extensions and the file format extensions used by other popular tar programs.
The author of the TFA has used some obsolete tar program, which is the cause for the surprising behavior that was seen.
To avoid loss of data on Linux, I always use the PAX file format instead of tar or cpio, with the extensions implemented by "bsdtar --create --format=pax" from libarchive, and I always alias cp to '/bin/cp --no-dereference --recursive --one-file-system --preserve=all --strip-trailing-slashes --verbose --interactive', where cp has been built with extended attributes support.
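A rough sketch of the pax-format invocation, for illustration only. It prefers `bsdtar` when it is installed and otherwise falls back to the system `tar` (an assumption on my part; GNU tar also accepts `--format=pax`, but it does not store extended attributes unless asked to). The directory and file names are placeholders.

```shell
#!/bin/sh
# Prefer libarchive's bsdtar if present; fall back to the system tar.
tar_cmd=tar
if command -v bsdtar >/dev/null 2>&1; then
    tar_cmd=bsdtar
fi

# Placeholder data to archive.
work=$(mktemp -d)
mkdir -p "$work/data"
echo "payload" > "$work/data/file.txt"

# Create the archive in pax format, as in the command quoted above.
"$tar_cmd" --create --format=pax --file "$work/backup.pax.tar" -C "$work" data

# List the members to confirm the archive is readable.
"$tar_cmd" --list --file "$work/backup.pax.tar"
```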
crazygringo 2 days ago [-]
> The question is always whose surprise.
I think that the surprise of more data than expected is more desirable than the surprise of data loss. So in this case, it seems like the safe choice.
dlenski 2 days ago [-]
Agreed. I usually hate on Apple, and its terribly ancient utilities and gratuitous incompatibility with modern Linux utilities, motivated by hatred of the GPL license.
But in this case, I think what it's doing is… basically fine? "Tar should faithfully reproduce the semantics of the source filesystem" is a perfectly reasonable starting point.
Ideally there would be a documented way to turn off the Apple-specific metadata with Apple's own tar, though.
saagarjha 2 days ago [-]
From tar(1):
--no-mac-metadata
(x mode only) Mac OS X specific. Do not archive or extract ACLs
and extended file attributes using copyfile(3) in AppleDouble
format. This is the reverse of --mac-metadata. and the default
behavior if tar is run as non-root in x mode.
Arainach 2 days ago [-]
Apple is always surprised that non-Apple devices exist.
See: the permanent undismissable red icon to "finish setting up your Apple TV with your iPhone"
simianparrot 2 days ago [-]
Apple can't control non-Apple devices. They can only control their own. So this makes perfect sense.
dwattttt 2 days ago [-]
They could control their own Apple TVs to allow that dialogue to be dismissed via the TV controls.
fingerlocks 2 days ago [-]
Agreed, but why not just finish setting it up? Or do people own Apple TVs without iPhones? That never occurred to me since a large part of the value prop is phone integration
Arainach 2 days ago [-]
No, the value prop is a streaming device with a clean UX not filled with ads. My phone (which is not an iPhone) has nothing to do with it. Apple TV is a far better YouTube device than Google TV. It's also the best device for Plex, Netflix, and all the streaming apps.
dwattttt 2 days ago [-]
Yes, I believe it's possible to buy an Apple TV without owning an iPhone.
jooize 2 days ago [-]
What integrations do you use? I can't really think of what I would miss on the Apple TV if I switched from iPhone. I rarely use AirPlay, disable Photos for in-house privacy reasons, and… oh yeah, the remote control for keyboard, volume, and navigation via iPhone is neat! I think the Apple TV is just a strong product on its own.
fingerlocks 23 hours ago [-]
I use screen mirroring, a lot. Guess I’m in the minority around here. Really nice projecting your phone on a massive OLED to multitask on the phone. Or even pair programming and conference calls you can mirror the phone to TV for the call while coding on the laptop.
I use my Apple TV like it’s a big iPad stuck to the wall. Because that’s basically what it is. I honestly had no idea so many people just buy it to stream the same content on every other platform
Someone 2 days ago [-]
> Someone who expects tar to behave like other UNIX systems is going to be surprised by this
They shouldn’t. The GNU tar manual already shows this behavior. https://www.gnu.org/software/tar/manual/html_node/What-tar-D...:
“Because the archive created by tar is capable of preserving file information and directory structure, tar is commonly used for performing full and incremental backups of disks”
And yes, that same page also says:
“You can create an archive on one system, transfer it to another system, and extract the contents there. This allows you to transport a group of files from one system to another.”
> You can't have this problem if your packaging system pulls in a specific portable `tar` library.
You can’t pull in specific portable stuff all the way down (not even when running in Docker or a VM), so that will decrease the risk, but it cannot completely remove it. As an example, I think GNU tar will happily include .DS_Store files in archives.
Joker_vD 2 days ago [-]
> I increasingly feel like build systems should never be relying on any "native" utilities from the host system, and should instead be bringing them in via dependencies.
Well, you see, while this, frankly, applies not just to build systems but to most of software, the consensus in the community of distro-maintainers is that it's actually wrong: you should use your system's package manager, and tools it can install, and let it fiddle with the ambient environment and give you that delicious "path dependency". And if your distro's packaging environment doesn't allow to do the things you need (e.g. being able to install both mongodb 3.8 and mongodb 5.0, ideally at the same time, but okay, I can keep running apt remove/install over and over, but I do need to check if my app correctly handled the wire protocol changes), well, that's your problem for desiring strange things.
amarant 2 days ago [-]
NixOS has a pretty solid solution to this issue: key your dependencies with checksums of the content. That way you get the best of both worlds: you always get the exact version you want, and you can share a copy of that exact version with other software that wants to use that exact version too!
JoshTriplett 2 days ago [-]
Yeah, Nix-like distributions (e.g. guix, lix) do for Linux systems what some language package managers (e.g. cargo) do for individual projects.
dented42 2 days ago [-]
So it sounds like you don’t get the exact version you want because metadata is thrown away.
amarant 2 days ago [-]
Curious, what is your software doing that it depends on specific metadata in your dependencies? What metadata do you require? Most file metadata is stuff like created timestamp, last edit timestamp, read/write/execute permissions...
I'm just trying to think of a case where metadata would be relevant in a dependency?
rrvsh 2 days ago [-]
It's a checksum not the content itself
altairprime 2 days ago [-]
Are the xattr / chattr / umask checksums rolled into the main data fork content or are they hashed separately (or not at all)?
a_t48 2 days ago [-]
IIRC Nix is checksummed in the hash of the source of the content, not the results.
microtonal 2 days ago [-]
Hash of a normalization of the derivation, so this roughly means source, dependencies and the ‘build recipe’. The exception are fixed-output derivations, which are typically content-hashed.
That said, a lot of work is done in content-addressed hashing, but AFAIK it’s not the default yet.
adrian_b 2 days ago [-]
> I want that file system archived to tape. And so, tar does.
The traditional UNIX tar and cpio utilities cannot archive the modern Linux file systems without loss of metadata.
Most modern tar programs implement various file format extensions as a workaround for this, but the extensions may be incompatible between distinct tar programs and frequently they are very poorly documented.
Some years in the past, libarchive was the only archiver available on Linux that guaranteed lossless backups for the Linux file systems, e.g. xfs or ext4 (and also lossless file transfers between Linux file systems and FreeBSD file systems). Therefore that is what I have been using on Linux since then.
Presumably since then GNU tar and other tar programs should have caught up with it, but I have not verified this.
Whichever tar program was used in TFA, it was an obsolete tar program, and that was the real problem, not that the archives had been created on an Apple computer.
saghm 2 days ago [-]
If you think that most people who run the tar command are assuming it will work like a tape archive, you'll probably be the one surprised
taftster 2 days ago [-]
> That said, never seeing another .DS_Store should be a system-wide option!
Yes please.
ryandrake 2 days ago [-]
.DS_Store, .fseventsd, .Spotlight-V100, .Trashes, and ._this and ._that
These can all die in a fire too, as far as I am concerned. macOS loves to treat the user's filesystem as its own personal garbage dump.
gerdesj 2 days ago [-]
thumbs.db and those weird MS alternative stream files for recording origination.
filesystem attributes are for decorating files with meaning. Anything else that attempts to use filesystems in "interesting" ways is silly.
Apple and MS really ought to consider why they do this sort of fragile, idiosyncratic nonsense.
Joker_vD 2 days ago [-]
But... thumbs.db is precisely not an "attempt to use filesystems in "interesting" ways": it's literally just a hidden file with previews stored in it. Storing the preview in the alternative stream of the file with the picture itself would be "an interesting way".
kstrauser 2 days ago [-]
Agreed. Where else would you put that stuff? It’s gotta go somewhere, and this is the least surprising place IMO. Anywhere else would have to be a parallel store that follows filesystem mounts and unmounts, renaming directories, etc. so that it always perfectly mirrors the thing it’s configuring.
noisem4ker 2 days ago [-]
> Where else would you put that stuff?
A "Centralized thumbnail cache" in the user profile folder, where it's been for a long while.
That Windows takes an approach does not mean it’s a good idea.
And what about things like folder settings, such as whether to display it as a list or as icons, or how to sort it, etc.? That’s more important than a thumbnail cache.
gerdesj 1 days ago [-]
Put it in $ProgramData for system-wide usage or whatever the user version is for individuals.
A hidden file is exactly what I said initially - a daft local decoration. Instead of using a stream, this one uses an attribute instead.
Put your data where it makes sense on the filesystem but don't dump arbitrary databases of information on there utilizing filesystem attributes because that is incredibly fragile.
thumbs.db only makes sense if the client is Windows (and only from a particular version onwards, until it doesn't). In the real world (starting with my laptop, running Ubuntu) it does not make any sense at all and is just a pain.
I don't want to see your thumbs.db or your weird ~{temp office files} either. Why do you insist on crapping on my nice neat file system?
Joker_vD 18 hours ago [-]
> Put your data where it makes sense on the filesystem
Right near the data it's derived from is the most obvious place, you know, and makes sense for most of the application developers (it may not "make sense" for you but so what).
> Why do you insist on crapping on my nice neat file system?
"Your" neat file system? What a quaint notion. Two thirds of the hierarchy inside of your $HOME belongs to the OS you use and the tools you use (not "your OS" and "your tools" — just because you use something doesn't make it yours, you know). Your data is yours, of course, but the disk space belongs to the system harness first, and to you second, and the same applies to the file and directory organization.
Or at least that seems to be the prevailing attitude of most of the software.
> thumbs.db only makes sense if the client is Windows (and only from a particular version onwards, until it doesn't). In the real world (starting with my laptop, running Ubuntu) it does not make any sense at all and is just a pain.
Wait, didn't Nautilus use to read thumbs.db if it was present in the folder? Or am I thinking of some other file manager?
mook 2 days ago [-]
In the particular case of thumbs.db, storing them in NTFS alternate data streams would have been a good idea; they're essentially caches for the main data stream, so if they fail to copy to different filesystems it's totally fine. Of course, that wasn't viable because 1) IIRC that was before the widespread adoption of NTFS, and 2) they probably still need the cache somewhere for vFAT USB drives.
fingerlocks 2 days ago [-]
And .DS_Store is just your folder level preferences in Finder. If you don’t use Finder they won’t be created
taftster 1 days ago [-]
Yes. And truthfully, I try to remember to only ever navigate my project folders (particularly those under revision control) using command line and/or IDE folder views.
But eventually, for whatever reason, I use Finder to go looking into a directory structure and bam, now I have .DS_Store. gitignore takes care of it, I know, but still, it's annoying.
noisem4ker 2 days ago [-]
> Thumbs.db
Windows has been storing thumbnail cache in the user profile folder since Vista (2006).
It's been 20 years. Time to let it go.
emmelaich 2 days ago [-]
OTOH, If you want the information contained in those files, where else would you save it?
ajxs 2 days ago [-]
To me it seems more sensible to store information relevant only to this OS in a specific cache somewhere within that OS. It would even make cache-like functionality such as evicting old entries super easy.
Gigachad 2 days ago [-]
There are some tradeoffs. Like if you used a usb and set up folder colours or any of the other things stored in the file, they would not move along with the usb when used on another computer.
ajxs 2 days ago [-]
If I set a folder colour in Finder on my work MacBook, and then plug that USB drive into my personal computer which uses Thunar as a file browser on Debian, nothing would happen.
Someone 2 days ago [-]
And? If you mount a Unix file system on another system, you may see ‘invisible’ files whose name starts with a period, may even see weird files named “.” or “..”, may not see ACLs, and may not see any file attributes such as user and group information.
In 1970 it already was not true that one could treat all filesystems the way Unix did, but it certainly isn’t true anymore today.
Someone 2 days ago [-]
> sensible to store information relevant only to this OS in a specific cache somewhere within that OS.
For most of these files, this isn’t information that can be reconstructed, so caching isn’t an option.
Also, the information has to move with the disk, if it is moved to or mounted on another system.
matheusmoreira 2 days ago [-]
It's a good attitude to have, in my opinion. Portability is overrated. Linux developers should be doing a lot more of this. We should be making everything work better for us without caring how it's going to impact other irrelevant platforms. Let the people who actually care about those platforms worry about such things.
cozzyd 2 days ago [-]
It would at least be nice if there was a way to keep apple users from shitting all over the filesystem with remote mounts and ds_store files. Perhaps by automatically unmounting if one is detected.
Maskawanian 2 days ago [-]
At least with Samba you can use the "veto files" and "delete veto files" global directives to deal with those, I personally use the following for veto files:
I understand that I may lose resource forks, but that isn't a problem for the use case of my server.
cozzyd 2 days ago [-]
unfortunately this is mostly people ssh mounting.
I think I can probably write an eBPF rule to avoid writing them though. Or disconnect their sessions. Or modify the .DS_Store to change the Finder background to something amusing.
At least if you're using ZFS as the backing store and Samba, you can set `vfs objects = catia fruit streams_xattr` and similar config options to use extended attributes.
messe 2 days ago [-]
> Linux developers should be doing a lot more of this. We should be making everything work better for us without caring how it's going to impact other irrelevant platforms
Linux developers already do. Using a BSD can already be a pain in the arse, thanks to (often poorly thought out) Linux-isms cropping up everywhere.
pjmlp 2 days ago [-]
Many have a tendency to mix GNU/Linux with UNIX, unfortunately.
Which is why I enjoy at least on embedded we are having plenty of choice between FreeRTOS, NuttX, and plenty others.
Gigachad 2 days ago [-]
Portability of tar archives at least. We should have some formats like .zip which are standardised, and allow others like tar to be faithful replicas of exactly how the OS stores data.
gjadi 2 days ago [-]
Except that zip does not preserve permissions.
Gigachad 2 days ago [-]
That seems fine to me. I’ve never cared about permissions in a zip. Zip these days is primarily for exchanging a directory as a single file to another person. Permissions wouldn’t work across computers anyway.
If you want a faithful archive of the data then a tar archive or disk image is what you want.
adrian_b 2 days ago [-]
Yes, I completely disagree with TFA.
The problem described in TFA is not specific to Apple, but the same problem appears when archiving any decent filesystem that has been designed during the last 3 decades and not half a century ago, including all Linux file systems.
The problem described in TFA is not caused by Apple, but by the author using an obsolete tar program and not being aware of this.
The traditional tar file format cannot store a lot of the metadata that is contained in modern file systems (e.g. high resolution timestamps, access control lists, extended file attributes), so it is useless for such file systems.
Most modern "tar" implementations have added extensions to the tar file format, to make it usable with modern file systems, such as Linux XFS or Linux EXT4. But many of these extensions are incompatible between themselves, so certain tar files can be fully extracted only with the same tar program that has created them.
I strongly recommend against using the old tar or cpio file formats. Even with various extensions it is not guaranteed that they always work correctly.
I always use only the pax file format, which has also required extensions in order to work with the modern file systems, but the pax extensions are cleaner than those for tar, because the file format is better designed.
Libarchive, which was mentioned in TFA, is available in most Linux distributions or it can be built from source on any Linux computer. It provides an executable that is preferable to tar (better invoked as "bsdtar --format=pax") for the backup or transfer of any Linux files.
I have not checked recently GNU tar or other tar programs available on Linux, and I hope that meanwhile they have been upgraded to be able to archive losslessly the Linux file systems, but some years ago that was not true, so using tar or cpio on Linux could easily corrupt the archived files.
jmclnx 2 days ago [-]
To me, the big question is why Apple needs all these file attributes? If the files are extracted OK, just ignore the errors :)
bombcar 2 days ago [-]
Apple has had multiple streams per file since the very beginning, and it can store useful and necessary information (the latter is quite rare now, as most things have sane defaults, but losing the extended attributes can lose things that can be annoying).
hamasho 2 days ago [-]
Funnily enough, I got the error message and asked Claude Code, and it replied:
The warning can be suppressed by `--no-xattrs --no-mac-metadata`.
then just edited the code as
- tar czf dist.tar.gz dist
+ COPYFILE_DISABLE=1 tar czf dist.tar.gz dist
mxmlnkn 2 days ago [-]
The title seems misleading.
These are not errors. They are simply warnings about extended attributes being ignored when extracting files, which seems completely fine to me, and creating the tar without those extended attributes has exactly the same outcome, but throws away the metadata at archive time instead of extraction time.
Furthermore, this is not an Apple/macOS issue. The tool used is bsdtar, so it would also affect all BSD-variants that default to bsdtar/libarchive, and those systems also have extended attributes, e.g., for SELinux, which would get added to the TAR.
nomel 1 days ago [-]
It's unfortunate there was never a standardization around how to print different "levels" of messages. Imagine how nice things could be if we had a standard system call for the equivalent of log level, early on!
pier25 2 days ago [-]
I use these settings when creating a tar file for deploy:
tar --no-xattrs --no-mac-metadata -czf
jherskovic 2 days ago [-]
I do the same thing when building archives on macOS that I will unpack on Linux later.
bestony 2 days ago [-]
Thanks for the command! I'll try using it to package next time.
throw0101a 2 days ago [-]
Per this 2018 page, GNU tar seems to work with SCHILY.* encoded xattrs, but not LIBARCHIVE.* ones:
AFAICT, bsdtar will default to "ustar" format, but will auto-switch to "pax" if needed.
Pay08 2 days ago [-]
I wonder how come GNU tar never added them. I have to assume someone has brought the problem to their attention before.
Cockbrand 2 days ago [-]
I don't see the issue here. Some filesystems store data in several streams, and have been doing so for decades. GNU tar gives warnings about stuff it doesn't know, and which can be ignored without a problem. If you haven't ever seen a different *nix system than Linux, this may be a surprise, but it should be taken as an opportunity to learn something new and move on.
nottorp 2 days ago [-]
I don't see errors, just warnings about unknown metadata. It's annoying yes. But they aren't errors.
ninefathom 2 days ago [-]
"For some reason" there are ._ files... now I'm really feeling old, that an apparently-savvy macOS user seems to be unaware of the history of resource forks and AppleDouble. That was fundamental knowledge in the subject of Mac/PC interchange not terribly long ago.
ChiperSoft 2 days ago [-]
I remember when creating a zip file from the finder filled it with a bunch of resource fork files that confused the hell out of anyone opening it on windows.
seba_dos1 2 days ago [-]
> Why does it have those extra files?
> For some reason
Very informative!
ktm5j 2 days ago [-]
They aren't even files.. they're just file metadata called extended attributes. This problem has nothing to do with tar either, it's about differences between the OS filesystems. Really nothing to see here IMO.
albertzeyer 2 days ago [-]
But these are not errors. These are just warnings you can ignore? It's not really so critical?
angry_octet 2 days ago [-]
We might also ask, why doesn't Linux also track such meta-data? Are Linux users not also subject to drive-by downloads impersonating valid files? Should we be one chmod a+x away from compromise?
danielheath 2 days ago [-]
Yes, we should be.
My computer should run programs when I tell it to run them.
Don’t blunt _every_ tool just to make them harder to cut yourself on.
angry_octet 2 days ago [-]
I hope you're in the very small minority of people who rigorously manage untrusted downloads and whitelist every binary, because you're operating an appliance from the 1970s, sticking a metal fork into an un-earthed toaster. Most people need help from their operating system.
b65e8bee43c2ed0 2 days ago [-]
then we, the very small minority, want a button to disable that help.
rtpg 2 days ago [-]
Increased metadata isn't tool blunting in itself though, even if MacOS uses it for being... annoying is one way of saying it.
Provenance information bundled into a file is not the worst idea in the world IMO. We have created/modified timestamps on files already, right? There's definitely the question of "why" but hey if more of my binaries just had at least a tag about who put them there that would be a win in my book.
Not an argument for doing what MacOS does, just an argument that the info would be nice to have.
Joker_vD 2 days ago [-]
I sincerely agree. By the way, thanks for lending your machine for my "Network-Retransmission-and-Compute-as-a-service" network.
danishanish 2 days ago [-]
It’s not blunting a tool, it’s sheathing it. Modern software requires too much proxied trust for this attitude to work.
emmelaich 2 days ago [-]
Tar on linux will. e.g. selinux attrs and other xattrs.
Open question: is it worth attempting to maintain these semantics between Mac and Linux?
worthless-trash 2 days ago [-]
No,
I just assume apple will break the behavior when they want to.
worthless-trash 1 days ago [-]
I'm downvoted, but why you cowards.
Why is this assumption incorrect, apple have a long history of breaking away from standards when it doesn't suit them.
bitfilped 2 days ago [-]
Should I be able to run files I download on my own computer? I think yes I should, hate fighting MacOS to do simple tasks because Apple engineers assume the end user has the average intelligence of an ostrich.
shawn_w 2 days ago [-]
That might be an overly optimistic assumption for the typical user, to be fair.
nottorp 2 days ago [-]
> Are Linux users not also subject to drive-by downloads impersonating valid files?
Linux users generally install software with apt or rpm. Or steam.
The existence of any executable file outside the system dirs is a red flag in itself.
chmaynard 2 days ago [-]
Homebrew installs GNU tar as "gtar". On my M4 MacBook:
$ which gtar
gtar is /opt/homebrew/bin/gtar
fastily 2 days ago [-]
I've installed the gtar formula and aliased it to tar. Can't be bothered to memorize the differences between macOS tar and unix tar, especially when the latter is considered to be the de facto standard
I'll admit that if I don't care about extended attributes (I never really do) I just use zip instead.
chungy 2 days ago [-]
I have bad news for you: Zip supports storing extended attributes as well.
red_admiral 2 days ago [-]
Why switch to a completely different tar and rewire the PATH when you could just set a shell alias? You'll need to edit .bashrc both times but there's no need to install a second tar to /opt to solve this.
firesteelrain 2 days ago [-]
You can either send stderr to /dev/null or use --warning=no-unknown-keyword to suppress them cleanly.
But still interesting nonetheless why they are added
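A small sketch of the second option, in case it helps. The `--warning=no-unknown-keyword` flag is a GNU tar option, so the helper below (the name `quiet_extract` is made up) only passes it when `tar --version` reports GNU tar, and otherwise extracts normally. The demo archive is built locally, so it will not actually contain Apple metadata.

```shell
#!/bin/sh
# Hypothetical helper: extract an archive, silencing only the
# "unknown extended header keyword" warnings when GNU tar is in use.
quiet_extract() {
    archive=$1
    dest=$2
    mkdir -p "$dest"
    if tar --version 2>/dev/null | grep -q 'GNU tar'; then
        tar --warning=no-unknown-keyword -xzf "$archive" -C "$dest"
    else
        # Other implementations (e.g. bsdtar) are not given this flag.
        tar -xzf "$archive" -C "$dest"
    fi
}

# Demo: build a throwaway archive and extract it quietly.
work=$(mktemp -d)
mkdir -p "$work/dist"
echo "hello" > "$work/dist/index.html"
(cd "$work" && tar -czf dist.tar.gz dist)
quiet_extract "$work/dist.tar.gz" "$work/out"
ls "$work/out/dist"
```

Compared with sending stderr to /dev/null, this keeps every other warning and error visible.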
raffraffraff 2 days ago [-]
Would this ever affect me if I don't use many of macOS's built-in tools? I brew install GNU equivalents and make them all default. Just like how I also don't use most of their desktop environment stuff, and instead use rectangle, hammerspoon, karabiner to make it feel more like the Linux desktop I wish I could use at work.
thomas_viaelo 2 days ago [-]
Mostly yes. If `tar` resolves to gtar in your PATH, your archives won't carry the LIBARCHIVE.* xattrs that GNU tar can't decode, so the warnings go away.
One thing that still trips me up though: `._Foo.txt` AppleDouble files get created in your filesystem any time something Finder-adjacent touches a folder, and gtar archives them just fine, but they show up as garbage on the Linux side. `dot_clean -m mydir/` before tarring kills them, or you can pipe through `--exclude='._*'` if you don't want to touch the source tree.
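Roughly what the `--exclude` route looks like, as a sketch with placeholder names (`mydir`, `clean.tar.gz`). The `.DS_Store` exclusion is an extra I added for illustration. The sidecar file here is faked with `echo` so the example can be run on any system.

```shell
#!/bin/sh
# Build a sample tree containing an AppleDouble-style sidecar file.
work=$(mktemp -d)
mkdir -p "$work/mydir"
echo "data" > "$work/mydir/Foo.txt"
echo "sidecar" > "$work/mydir/._Foo.txt"

# Skip sidecars and .DS_Store at archive time; the source tree is untouched.
(cd "$work" && tar --exclude='._*' --exclude='.DS_Store' -czf clean.tar.gz mydir)

# List the members: only mydir/ and mydir/Foo.txt should remain.
tar -tzf "$work/clean.tar.gz"
```

Unlike `dot_clean`, this leaves the `._` files on disk, so Finder keeps whatever it stored in them.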
mg794613 2 days ago [-]
Oh Arul John, just because you don't understand doesn't mean it's an error.
What horrible advice also to download different tar versions, for something that should just be explained properly.
If it weren't for the "2024" in the title, I would have thought this to be a result from AI.
But it's not artificial intelligence. It's real stupidity.
wolfi1 2 days ago [-]
fun fact you don't need the hyphen. tar cvzf works also
anthk 2 days ago [-]
How well does pax handle this?
LoganDark 2 days ago [-]
> If you are using a Mac with an Apple Silicon M1, M2, M3 or M4 processor,
Don't forget the MacBook Neo's A18 Pro :)
crest 2 days ago [-]
[dead]
kaiwn 2 days ago [-]
Let me google the name of the author of this blog...
Apple treats tar less like “portable Unix interchange” and more like “archive this filesystem object faithfully.” That is very Apple, and very libarchive. ;-)
This is probably going to get worse (as Apple continues to add macOS-specific metadata), so your workaround is very helpful.
I haven't tested it in a while, but at one point, setting the COPYFILE_DISABLE=1 env variable would disable the inclusion of macOS-specific metadata.
If I point "tape archive" at a file system, I want that file system archived to tape. And so, tar does.
If I don't, well, that's a fine option, and there's a fine option for that.
So it's less of a "workaround" or something that "gets worse", than, "No, I don't really want a tape archive of this filesystem, only of some of it." And that's supported.
That said, never seeing another .DS_Store should be a system-wide option!
Principle of least surprise is good engineering practice. The question is always whose surprise. Someone who expects tar to behave like other UNIX systems is going to be surprised by this. Someone who expects tar on Apple to have perfect fidelity would be surprised by not-this.
I increasingly feel like build systems should never be relying on any "native" utilities from the host system, and should instead be bringing them in via dependencies. You can't have this problem if your packaging system pulls in a specific portable `tar` library.
What is worse is that these utilities do not give any warnings when they do not make complete copies. For cp, the root cause is that it has bad default options, while for tar and cpio the standard file formats cannot store the metadata of modern file systems.
The various tar programs have their own different file format extensions to deal with modern file systems, which are guaranteed to work only when using the same tar program for both creation and extraction. The better tar programs implement both their own file format extensions and the file format extensions used by other popular tar programs.
The author of TFA used an obsolete tar program, which is the cause of the surprising behavior that was seen.
To avoid loss of data on Linux, I always use the PAX file format instead of tar or cpio, with the extensions implemented by "bsdtar --create --format=pax" from libarchive, and I always alias cp to '/bin/cp --no-dereference --recursive --one-file-system --preserve=all --strip-trailing-slashes --verbose --interactive', where cp has been built with extended attributes support.
I think that the surprise of more data than expected is more desirable than the surprise of data loss. So in this case, it seems like the safe choice.
But in this case, I think what it's doing is… basically fine? "Tar should faithfully reproduce the semantics of the source filesystem" is a perfectly reasonable starting point.
Ideally there would be a documented way to turn off the Apple-specific metadata with Apple's own tar, though.
See: the permanent undismissable red icon to "finish setting up your Apple TV with your iPhone"
I use my Apple TV like it’s a big iPad stuck to the wall. Because that’s basically what it is. I honestly had no idea so many people just buy it to stream the same content on every other platform.
They shouldn’t. The GNU tar manual already shows this behavior. https://www.gnu.org/software/tar/manual/html_node/What-tar-D...:
“Because the archive created by tar is capable of preserving file information and directory structure, tar is commonly used for performing full and incremental backups of disks”
And yes, that same page also says:
“You can create an archive on one system, transfer it to another system, and extract the contents there. This allows you to transport a group of files from one system to another.”
> You can't have this problem if your packaging system pulls in a specific portable `tar` library.
You can’t pull in specific portable stuff all the way down (not even when running in Docker or a VM), so that will decrease the risk, but it cannot completely remove it. As an example, I think GNU tar will happily include .DS_Store files in archives.
Well, you see, while this, frankly, applies not just to build systems but to most software, the consensus in the community of distro maintainers is that it's actually wrong: you should use your system's package manager, and the tools it can install, and let it fiddle with the ambient environment and give you that delicious "path dependency". And if your distro's packaging environment doesn't allow you to do the things you need (e.g. being able to install both mongodb 3.8 and mongodb 5.0, ideally at the same time, but okay, I can keep running apt remove/install over and over, but I do need to check if my app correctly handled the wire protocol changes), well, that's your problem for desiring strange things.
I'm just trying to think of a case where metadata would be relevant in a dependency?
That said, a lot of work is done in content-addressed hashing, but AFAIK it’s not the default yet.
The traditional UNIX tar and cpio utilities cannot archive the modern Linux file systems without loss of metadata.
Most modern tar programs implement various file format extensions as a workaround for this, but the extensions may be incompatible between distinct tar programs and frequently they are very poorly documented.
Some years ago, libarchive was the only archiver available on Linux that guaranteed lossless backups for the Linux file systems, e.g. xfs or ext4 (and also lossless file transfers between Linux file systems and FreeBSD file systems). Therefore that is what I have been using on Linux since then.
Presumably since then GNU tar and other tar programs should have caught up with it, but I have not verified this.
Whichever tar program was used in TFA, it was an obsolete tar program, and that was the real problem, not that the archives had been created on an Apple computer.
Yes please.
These can all die in a fire too, as far as I am concerned. macOS loves to treat the user's filesystem as its own personal garbage dump.
filesystem attributes are for decorating files with meaning. Anything else that attempts to use filesystems in "interesting" ways is silly.
Apple and MS really ought to consider why they do this sort of fragile, idiosyncratic nonsense.
A "Centralized thumbnail cache" in the user profile folder, where it's been for a long while.
https://en.wikipedia.org/wiki/Windows_thumbnail_cache
> so that it alway perfectly mirrors
Who cares? It's a cache.
And what about things like folder settings, such as whether to display it as a list or as icons, or how to sort it, etc? That’s more important than a thumbnail cache.
A hidden file is exactly what I said initially - a daft local decoration. Instead of using a stream, this one uses an attribute instead.
Put your data where it makes sense on the filesystem but don't dump arbitrary databases of information on there utilizing filesystem attributes because that is incredibly fragile.
thumbs.db only makes sense if the client is Windows (and only from a particular version onwards, until it doesn't). In the real world (starting with my laptop, running Ubuntu) it does not make any sense at all and is just a pain.
I don't want to see your thumbs.db or your weird ~{temp office files} either. Why do you insist on crapping on my nice neat file system?
Right near the data it's derived from is the most obvious place, you know, and makes sense for most of the application developers (it may not "make sense" for you but so what).
> Why do you insist on crapping on my nice neat file system?
"Your" neat file system? What a quaint notion. Two thirds of the hierarchy inside of your $HOME belongs to the OS you use and the tools you use (not "your OS" and "your tools" — just because you use something doesn't make it yours, you know). Your data is yours, of course, but the disk space belongs to the system harness first, and to you second, and the same applies to the file and directory organization.
Or at least that seems to be the prevailing attitude of most of the software.
> thumbs.db only makes sense if the client is Windows (and only from a particular version onwards, until it doesn't). In the real world (starting with my laptop, running Ubuntu) it does not make any sense at all and is just a pain.
Wait, didn't Nautilus use to read thumbs.db if it was present in the folder? Or am I thinking of some other file manager?
But eventually, for whatever reason, I use Finder to go looking into a directory structure and bam, now I have .DS_Store. gitignore takes care of it, I know, but still, it's annoying.
Windows has been storing thumbnail cache in the user profile folder since Vista (2006).
It's been 20 years. Time to let it go.
In 1970 it already was not true that one could treat all filesystems the way Unix did, but it certainly isn’t true anymore today.
For most of these files, this isn’t information that can be reconstructed, so caching isn’t an option.
Also, the information has to move with the disk, if it is moved to or mounted on another system.
I think I can probably write an eBPF rule to avoid writing them though. Or disconnect their sessions. Or modify the .DS_Store to change the Finder background to something amusing.
Linux developers already do. Using a BSD can already be a pain in the arse, thanks to (often poorly thought out) Linux-isms cropping up everywhere.
Which is why I enjoy at least on embedded we are having plenty of choice between FreeRTOS, NuttX, and plenty others.
If you want a faithful archive of the data then a tar archive or disk image is what you want.
The problem described in TFA is not specific to Apple, but the same problem appears when archiving any decent filesystem that has been designed during the last 3 decades and not half a century ago, including all Linux file systems.
The problem described in TFA is not caused by Apple, but by the author using an obsolete tar program and not being aware of this.
The traditional tar file format cannot store a lot of the metadata that is contained in modern file systems (e.g. high resolution timestamps, access control lists, extended file attributes), so it is useless for such file systems.
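The timestamp part of this is easy to see with Python's standard `tarfile` module. This is only a sketch: `roundtrip_mtime` is a made-up helper, and it covers sub-second timestamps only, not ACLs or extended attributes:

```python
import io
import tarfile

def roundtrip_mtime(mtime: float, fmt: int) -> float:
    """Write one empty member with the given mtime and read the mtime back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo("example.txt")
        info.mtime = mtime
        tar.addfile(info)  # size is 0, so no file object is needed
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return tar.getmember("example.txt").mtime

# Same fractional timestamp, written once as ustar and once as pax.
ustar_mtime = roundtrip_mtime(1700000000.5, tarfile.USTAR_FORMAT)
pax_mtime = roundtrip_mtime(1700000000.5, tarfile.PAX_FORMAT)
print(ustar_mtime, pax_mtime)
```

The ustar header only has room for whole seconds, so the fraction is dropped there, while the pax writer puts the full value into an extended header record.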
Most modern "tar" implementations have added extensions to the tar file format, to make it usable with modern file systems, such as Linux XFS or Linux EXT4. But many of these extensions are incompatible between themselves, so certain tar files can be fully extracted only with the same tar program that has created them.
I strongly recommend against using the old tar or cpio file formats. Even with various extensions it is not guaranteed that they always work correctly.
I always use only the pax file format, which has also required extensions in order to work with the modern file systems, but the pax extensions are cleaner than those for tar, because the file format is better designed.
Libarchive, which was mentioned in TFA, is available in most Linux distributions or it can be built from source on any Linux computer. It provides an executable that is preferable to tar (better invoked as "bsdtar --format=pax") for the backup or transfer of any Linux files.
I have not recently checked GNU tar or other tar programs available on Linux, and I hope that they have since been upgraded to archive Linux file systems losslessly, but some years ago that was not true, so using tar or cpio on Linux could easily corrupt the archived files.
These are not errors. They are simply warnings about extended attributes being ignored when extracting files, which seems completely fine to me, and creating the tar without those extended attributes has exactly the same outcome, but throws away the metadata at archive time instead of extraction time.
Furthermore, this is not an Apple/macOS issue. The tool used is bsdtar, so it would also affect all BSD-variants that default to bsdtar/libarchive, and those systems also have extended attributes, e.g., for SELinux, which would get added to the TAR.
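If the warnings are a nuisance, one option is to drop the attribute records from an existing archive after the fact. A minimal sketch with Python's `tarfile`, assuming the attributes are stored as per-member pax records whose keys start with `LIBARCHIVE.xattr.` or `SCHILY.xattr.` (my understanding of what libarchive writes, not something stated in this thread); `strip_xattr_records` is a hypothetical helper:

```python
import io
import tarfile

# Assumed key prefixes for extended attributes stored in pax headers.
XATTR_PREFIXES = ("LIBARCHIVE.xattr.", "SCHILY.xattr.")

def strip_xattr_records(src: bytes) -> bytes:
    """Copy a tar archive, dropping per-member pax records that carry xattrs."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(src), mode="r") as tin, \
         tarfile.open(fileobj=out, mode="w", format=tarfile.PAX_FORMAT) as tout:
        for member in tin:
            # Keep every pax record except the ones with an xattr prefix.
            member.pax_headers = {
                key: value
                for key, value in member.pax_headers.items()
                if not key.startswith(XATTR_PREFIXES)
            }
            # Only regular files carry a payload that needs copying.
            fileobj = tin.extractfile(member) if member.isfile() else None
            tout.addfile(member, fileobj)
    return out.getvalue()
```

As the comment above says, this ends up in the same place as ignoring the warnings; it just discards the metadata earlier.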
* https://mgorny.pl/articles/portability-of-tar-features.html#...
* Via: https://github.com/mxmlnkn/ratarmount/issues/145
bsdtar ≥3.7.2 apparently adds both types to its files for maximum portability:
* https://github.com/libarchive/libarchive/pull/691/files#diff...
AFAICT, bsdtar will default to "ustar" format, but will auto-switch to "pax" if needed.
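To check which case a given archive fell into, you can look at the typeflag byte of each 512-byte header: pax extended headers use `x` (per file) or `g` (global). A rough sketch, assuming an uncompressed archive whose size fields are plain octal (no base-256 sizes for huge files); both function names are made up:

```python
BLOCK = 512

def header_typeflags(data: bytes) -> list:
    """Return the typeflag byte of every header block in an uncompressed tar."""
    flags = []
    offset = 0
    while offset + BLOCK <= len(data):
        header = data[offset:offset + BLOCK]
        if header == bytes(BLOCK):  # an all-zero block marks the end of the archive
            break
        flags.append(header[156:157])  # typeflag is byte 156 of the header
        size = int(header[124:136].strip(b"\0 ") or b"0", 8)  # octal size field
        payload_blocks = (size + BLOCK - 1) // BLOCK  # payload is padded to 512
        offset += BLOCK * (1 + payload_blocks)
    return flags

def uses_pax_records(data: bytes) -> bool:
    """True if the archive contains pax extended headers ('x' or 'g')."""
    return any(flag in (b"x", b"g") for flag in header_typeflags(data))
```

Calling `uses_pax_records(open("archive.tar", "rb").read())` on an archive (the file name is just a placeholder) tells you whether anything in it needed the pax extensions.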
> For some reason
Very informative!
My computer should run programs when I tell it to run them.
Don’t blunt _every_ tool just to make them harder to cut yourself on.
Provenance information bundled into a file is not the worst idea in the world IMO. We have created/modified timestamps on files already, right? There's definitely the question of "why" but hey if more of my binaries just had at least a tag about who put them there that would be a win in my book.
Not an argument for doing what macOS does, just an argument that the info would be nice to have.
Open question: is it worth attempting to maintain these semantics between Mac and Linux?
I just assume apple will break the behavior when they want to.
Why is this assumption incorrect? Apple has a long history of breaking away from standards when the standards don't suit them.
Linux users generally install software with apt or rpm. Or steam.
The existence of any executable file outside the system dirs is a red flag in itself.
It's in the name: GNU's Not Unix.
https://www.opengroup.org/openbrand/register/
But still interesting nonetheless why they are added
One thing that still trips me up though: `._Foo.txt` AppleDouble files get created in your filesystem any time something Finder-adjacent touches a folder, and gtar archives them just fine, but they show up as garbage on the Linux side. `dot_clean -m mydir/` before tarring kills them, or you can pass `--exclude='._*'` to tar if you don't want to touch the source tree.
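The same exclusion can be done without touching the source tree when the archive is built from a script. A small sketch using the `filter` hook of Python's `tarfile`; the function names are hypothetical, and I added .DS_Store to the skip list since it comes up elsewhere in the thread:

```python
import os
import tarfile

def skip_finder_metadata(info: tarfile.TarInfo):
    """tarfile filter: return None to leave a member out of the archive."""
    base = os.path.basename(info.name)
    if base == ".DS_Store" or base.startswith("._"):
        return None
    return info

def archive_clean(source_dir: str, archive_path: str) -> None:
    """Archive source_dir, skipping .DS_Store and AppleDouble ._* entries."""
    with tarfile.open(archive_path, "w") as tar:
        tar.add(source_dir, arcname=".", filter=skip_finder_metadata)
```

The filter is applied to every member during the recursive add, so the files stay on disk and are only left out of the archive.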
What horrible advice also to download different tar versions, for something that should just be explained properly.
If it weren't for the "2024" in the title, I would have thought this to be a result from AI.
But it's not artificial intelligence. It's real stupidity.
Don't forget the MacBook Neo's A18 Pro :)
Oh...