Hacker News — vinext + Cloudflare Workers


▲Formatting a 25M-line codebase overnight (stripe.dev)

198 points by r00k 1 days ago | 102 comments

CrzyLngPwd 1 days ago [-]

One of my first jobs was a small software company writing software for a small number of clients, in MS basic PDS.

The lead developer didn't like to bother with formatting code, so I wrote a tool called makenice to format his nasty spaghetti gibberish into something with good indents and layout to make it easier for us normal people to parse.

He was furious, literally spun in circles about it right in the office in front of everyone, so I wrote makenasty to format code into the way he appeared to like.

I only shared makenasty/nice with a couple of the team, who loved it, as it allowed easy conversion between something readable and something the team lead liked.

He never knew about makenasty.

munk-a 1 days ago [-]

Outside of the naming - this is a perfectly sane thing to do for developer comfort and can usually be accomplished with simple transformations.

There are often limitations (like manually added indentation/spacing for alignment) but as long as you're very intentional about what changes you'll allow and have a good understanding of the language it can be an extremely safe operation.

e28eta 22 hours ago [-]

I think git’s naming is actually pretty reasonable: smudge (on checkout) & clean (on stage).

munk-a 10 hours ago [-]

Oh smudge and clean are excellent names. My singly held objection to the OP was that they called one of the scripts "makenasty" instead of like "makemunkastyle" or something more neutral. I think it's an excellent idea I'd just avoid being judgemental in naming. You can consider my deep love of BSD braces super nasty but I'd prefer you didn't label it that way.

nitwit005 1 days ago [-]

If he didn't bother formatting code, it would seem impossible to create a tool that formatted code the way he preferred.

singpolyma3 1 days ago [-]

Sounds like he did format code, and even had opinions on how it should be formatted, but OP disagreed.

jameson 1 days ago [-]

reminds me of rob pike mentioning gofmt's style is "no one's favorite"

infogulch 13 hours ago [-]

The full quote:

> Gofmt's style is no one's favorite, yet gofmt is everyone's favorite - Rob Pike https://go-proverbs.github.io/#:~:text=Gofmt%27s%20style%20i...

The best part about gofmt is there is no discussion about how to format Go code. The style itself is fine, skipping endless hours of pointless debate is priceless.

ErroneousBosh 19 hours ago [-]

Having K&R brackets be a syntactical requirement and everything else is a syntax error is okay with me though.

drob518 14 hours ago [-]

K&R, ride or die.

Terr_ 1 days ago [-]

I find a lot of these conflicts resolve when everybody agrees that the pain of ugly/unnecessary diffs is greater than the pain of minor formatting disagreements.

mistrial9 11 hours ago [-]

it's because some people learned to put meaning into different ways to lay out dense expressions, or different kinds of comments in different contexts.

python was "weird" at first to C-tribe, because of the strict layout used to eliminate some of the syntax tokens. These stories come from a time before "order over all" in some factory code base was seen as Universally a virtue of some kind

ethical_source 1 days ago [-]

This kind of passive-aggressive bullshit is exactly what's wrong with tech. People don't decide things: they just passively resist, and authority ends up being a muddle of truncated information flows.

wutwutwat 20 hours ago [-]

If someone sharing an old war story with what I felt was a positive and joking tone triggered you this bad, I feel bad for anyone working with you.

Nothing in tech is worth going through life miserable. Nothing.

brentcrude 15 hours ago [-]

[dead]

munificent 1 days ago [-]

> We chose a Saturday to format the entire codebase to avoid merge conflicts. And while our test suite gave us high confidence we'd gotten everything right, it's always a bit daunting to have a diff so large that GitHub can't render it.

The dart formatter has an internal sanity check. It walks through the unformatted and formatted strings in parallel skipping any whitespace. If any non-whitespace characters don't match, it immediately aborts. This ensures that the only thing the formatter changes is whitespace, and makes it much less spooky to run it blind on a huge codebase.

That sanity check has saved my ass a couple of times when weird bugs crept in, usually around unusual combinations of language features around new syntax.

(Unfortunately, the formatter in the past year has gotten a little more flexible about the kinds of changes it makes, including sometimes moving comments relatively to commas and brackets, so this sanity check skips some punctuation characters too, making it a little less reliable.)
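
For readers who want to see the shape of that check, here is a minimal sketch in Python. It is not the dart formatter's actual code; the function name `only_whitespace_changed` is made up for illustration, and it assumes the simple version of the check that skips whitespace only, with no punctuation exceptions.

```python
def only_whitespace_changed(before: str, after: str) -> bool:
    """Return True if the two strings are identical once whitespace is ignored.

    Hypothetical helper: walks both strings in parallel, as described above.
    """
    i, j = 0, 0
    while True:
        # Skip any run of whitespace on each side.
        while i < len(before) and before[i].isspace():
            i += 1
        while j < len(after) and after[j].isspace():
            j += 1
        # If either side is exhausted, both must be for the check to pass.
        if i == len(before) or j == len(after):
            return i == len(before) and j == len(after)
        # Any mismatching non-whitespace character fails the check.
        if before[i] != after[j]:
            return False
        i += 1
        j += 1
```

A formatter could call this on its input and output and abort on `False`. Note that it treats `a 1` and `a1` as equal, so it only detects dropped or altered characters, not whitespace that was needed to separate tokens.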

Terr_ 1 days ago [-]

I imagine a fancier version would be to compare the Abstract Syntax Trees.
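
As a rough illustration of the AST idea, here is a sketch for Python source using the standard-library `ast` module. The helper name `same_ast` is invented for this example, and nothing here reflects how the dart formatter or Stripe's formatter works.

```python
import ast


def same_ast(before: str, after: str) -> bool:
    """Return True if both sources parse to structurally identical trees.

    ast.dump() omits line and column attributes by default, so pure layout
    changes (spacing, line breaks, redundant parentheses) compare as equal.
    """
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))
```

One caveat with this particular parser: Python's `ast` does not keep comments, so a formatter that silently dropped a comment would still pass this check.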

munificent 9 hours ago [-]

The balancing act is that the fancier your sanity check, the greater the chance of something slipping through its cracks too. Walking two strings in parallel is very simple and hard to get wrong. Traversing an AST and skipping a branch is exactly the kind of easy-to-make bug that the sanity check is designed to catch.

What I'd like to do is something somewhere in the middle where I walk the token stream and check that every token of the input ended up in the output, but I haven't figured out a simple and fast way to do that yet. Performance is particularly tricky because I obviously don't want to burn a bunch of CPU cycles on a sanity check that exists only to catch bugs.
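
A rough sketch of the token-level middle ground, written for Python source with the standard-library `tokenize` module rather than for Dart. The names `significant_tokens` and `same_tokens` are hypothetical, and the sketch makes no attempt to address the performance concern raised above.

```python
import io
import tokenize

# Tokens that carry no content of their own and may legitimately change.
IGNORED = {tokenize.NL, tokenize.ENDMARKER}
# Tokens whose presence matters but whose exact text (indent width, line
# ending) is allowed to differ between input and output.
NORMALIZED = {tokenize.INDENT, tokenize.DEDENT, tokenize.NEWLINE}


def significant_tokens(source: str) -> list[tuple[int, str]]:
    """Return (type, text) pairs for every token a formatter must preserve."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in IGNORED:
            continue
        text = "" if tok.type in NORMALIZED else tok.string
        tokens.append((tok.type, text))
    return tokens


def same_tokens(before: str, after: str) -> bool:
    """Return True if both sources contain the same token sequence."""
    return significant_tokens(before) == significant_tokens(after)
```

Because comments are tokens here, a dropped comment fails the check, and so does merging `a 1` into `a1`, which a plain character walk would miss.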

saghm 18 hours ago [-]

I've always thought it would make sense for formatters to be baked into the toolchain so that they can reuse the language's parser (presumably exposed as a library) and then be implemented via parsing to AST and then formatted back out so that they're guaranteed to be correct and normalized. This doesn't seem to be how most formatters work in practice though, although I'm not sure if it's because of performance reasons or a lack of support for the parser being exposed in language toolchains.

kevin_thibedeau 13 hours ago [-]

That is essentially what clang-format is.

saghm 12 hours ago [-]

Good point, I hadn't really thought about it, but the name makes it pretty clear it's using clang's tooling. I've only worked a small amount in C++, years ago in my career, but I distinctly remember feeling like clang-format was essentially perfect from my perspective, so it's nice to know that my abstract ideals bear out in practice.

caminanteblanco 1 days ago [-]

The only issue is then you're at the mercy of whatever parser your formatter uses to construct the AST

Terr_ 22 hours ago [-]

Well, if any (common, non-hobby) parser is thrown off by the reformatting, then it's probably not a safe reformatting either way.

hyperhello 23 hours ago [-]

Strictly speaking that wouldn’t work, since a1 is different from a 1, for example.

saghm 18 hours ago [-]

I don't think they're trying to say that this is a sufficient test for correctness but a necessary one.

munificent 9 hours ago [-]

Correct. It won't catch 100% of possible bugs, but it will catch most.

The kind of bugs that are easiest to write in a formatter is dropping a bit of syntax on the floor and forgetting to include it in the output, and the sanity check will catch those.

It's also definitely possible to miss some whitespace that's necessary for things like identifier separation, but... <shrug> it's a sanity check, not a proof of correctness.

saghm 2 hours ago [-]

In practice, that's how most software testing works anyhow!

Skeime 16 hours ago [-]

Lots of formatters also unify things like trailing commas, so it would be slightly more involved than this.

munificent 9 hours ago [-]

Yes, the dart formatter does that now too. So the sanity check ignores commas and semicolons, which makes it less robust as a sanity check, unfortunately.

hobofan 1 days ago [-]

I'm surprised they went with an all-at-once reformat. Even when doing it over a weekend this is bound to mess with a lot of open PRs at their scale.

I had to introduce a formatter in a few sizeable codebases in the past (few 100k to few million LOC), and I always did it incrementally via a script that reformatted all files that are not touched in any open PR. The initial run reformatted 95% of all files. Then I ran the script every day for ~two weeks and got up to 99.5% of all files and then manually each time one of the remaining ~dozen PRs that were WIP for longer were merged.
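
The selection step in that kind of script can be sketched in a few lines of Python. This is not the original script: the function name and the shape of `open_pr_files` (a mapping from a PR identifier to the paths it touches) are assumptions, and fetching that mapping from a code host is left out entirely.

```python
def files_safe_to_reformat(all_files, open_pr_files):
    """Return the files that no open PR touches, in sorted order.

    all_files:     iterable of repository paths.
    open_pr_files: dict mapping a PR identifier to the paths it modifies
                   (hypothetical input; how it is obtained is not shown).
    """
    touched = set().union(*open_pr_files.values())
    return sorted(set(all_files) - touched)
```

Run daily, the returned list grows as PRs merge, which matches the 95% then 99.5% progression described above.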

rileymichael 1 days ago [-]

both options have their pros and cons. if you utilize some form of ratcheting[1], you can sneak it in without your team knowing.. but all of your PRs for the foreseeable future will have a ton of reformatting screwing with your git blame. if you do it all at once, someone will have to sort out conflicts, but you can utilize `blame.ignoreRevsFile`[2] so that your history remains useful

[1] https://github.com/diffplug/spotless/tree/main/plugin-gradle...

[2] https://git-scm.com/docs/git-blame#Documentation/git-blame.t...

yorwba 20 hours ago [-]

Even if you spread out reformatting over multiple commits in different PRs, you can still make use of blame.ignoreRevsFile as long as your pull request workflow doesn't enforce squash merges even when somebody took extra care to produce a nice commit history.

hobofan 1 days ago [-]

Yes, that is a good point. This is also why I personally would recommend to let a central person/team handle the reformatting rather than sneaking it into every PR (- see my sibling comment). That way you can be in charge of having a uniform style of commit messages to make the reformat commits easy to identify and create a well kept ignoreRevsFile. I think that provides the best of both worlds.

BobbyTables2 1 days ago [-]

That’s a neat feature, thanks for sharing.

Unfortunately I find that code bases lacking auto formatting are often littered with non functional changes as developers temporarily instrument code, remove it, but leave whitespace changes behind.

In terms of tracking code changes, one really would have to rewrite the entire history with each commit reformatted.

WhyNotHugo 1 days ago [-]

> I'm surprised they went with an all-at-once reformat. Even when doing it over a weekend this is bound to mess with a lot of open PRs at their scale.

Rebasing PRs should be trivial, just rewrite all commits reformatting the files it touches, then rebase, `git checkout --theirs`, run formatter again, and `git rebase --continue`. It's methodical and scriptable, you don't need to manually resolve any conflicts.

skydhash 1 days ago [-]

You can always let the team know so that they can apply the formatter on their PR branch.

hobofan 1 days ago [-]

In the smaller migrations I did I tried that, but some way or another a decent chunk of the people still managed to get stuck in merge/rebase conflicts. I would almost explicitly not recommend giving that advice to the teams.

My rough blueprint for introducing a formatter or linter nowadays would be:

- Recorded knowledge share session around how to set up the tools for local use 1-2 weeks before the initial rollout, and outline how the process will take place

- On the day of the initial rollout send out a reminder + the recording again

- Do the initial PR

- Incrementally do the rest of the migration, and subscribe to the PRs that drag out the process

jrajav 1 days ago [-]

This is exactly the remedy to the PR issue. I've "lucked" into owning a Prettier formatting pass at two different places now, and did the same process at each - full pass on master, simple step-by-step process to follow to update any PR by running the format script.

zx8080 1 days ago [-]

It's probably because the author can shit on others (let me guess, a senior principal something engineer).

varun_ch 1 days ago [-]

I’m shocked at the 25M line part! That is a completely unfathomable amount of code for one codebase. I really want to know more about that.

phoyd 1 days ago [-]

I am more shocked by the "overnight" aspect. I tried running clang-format on the Chromium source (68,281 .cc files, 21 million lines according to wc):

$ find chromium-149.0.7826.1/ -name "*.cc" -exec cat {} + | wc
21640925 55715244 833460441

And that took less than 6 minutes on a single E5-2696 v3 from 2014:

$ time find chromium-149.0.7826.1/ -name *.cc | parallel -j 16 clang-format $x>/dev/null

real 0m5.666s
user 1m13.964s
sys 0m13.373s

That’s orders of magnitude faster, especially if we assume they’re not running their workloads on potatoes like mine. Is Ruby’s syntax really that much more complicated than C++, or is this a tooling problem?

deets87 1 days ago [-]

I don't think the post necessarily means it took multiple hours to format the codebase, I think they're probably just saying they worked on it off-hours and landed it while no one was working so that it didn't run into merge conflicts.

christophilus 1 days ago [-]

My guess would be tooling. I think the Ruby formatters are written in Ruby. I’d guess the clang one is written in C.

riffraff 21 hours ago [-]

Nah the article says it's rust and calling into a C library for parsing.

bruckie 1 days ago [-]

Only 25 million? :) Google had billions a decade ago...

https://research.google/pubs/why-google-stores-billions-of-l...

Groxx 1 days ago [-]

iirc they also vendor(ed) many of their dependencies, several layers deep, which still counts for "stores" though it's rather different than "wrote" / "maintains".

bruckie 23 hours ago [-]

Very true. It was still hundreds of millions of lines of first party code a decade ago, and could easily be over a billion at this point.

Groxx 22 hours ago [-]

Yeah, I can definitely believe that Google would break over a billion handwritten. It's a big company that has been around for a long time.

It's still absurd. But believable.

jsnell 1 days ago [-]

Right, where is the rest of the code?

mr_mitm 1 days ago [-]

They're up to 42 million now, as per the article

lukan 1 days ago [-]

That sounds even more insane to me, but I guess most of that code does not really touch financial transactions, otherwise it would be a nightmare being responsible to verify that.

clintonb 1 days ago [-]

Ruby code touches financial transactions. Card payments were migrated to Java when I left in 2022. Non-card payments (e.g., ACH, checks, various wallets) were still processed by Ruby.

PCI-related/vaulting code lived in its own locked-down repo. I think that was a mix of Go and Ruby.

Once you have the foundations in place for account balances and the ledger, processing a payment isn’t that daunting. Those foundations, however, took a lot to build and evolve.

jamesfinlayson 1 days ago [-]

> Once you have the foundations in place for account balances and the ledger, processing a payment isn’t that daunting. Those foundations, however, took a lot to build and evolve.

Pretty much. I've worked at places with PHP payment processing that worked just fine, and at a place with C++ payment processing (and no testers) and it worked just fine. I wasn't around when the systems were first built though so not sure if there were tears along the way.

varun_ch 1 days ago [-]

> migrated to Java

I want to know more about this

deathanatos 1 days ago [-]

My (much smaller than Stripe) company is well over 4.5M at this point, and the graph is very much exponential.

AI has been a huge problem here: the amount of code is just exploding. Quality of the produced code is another matter.

Neywiny 1 days ago [-]

^^^^^^^^^^^^^^^^^^^

I recently wrote a very esoteric Python script. 100 lines of code. No classes, no functions, but yes argparse.

I've tried out the latest open source models on the task. They go bananas. It's like Enterprise fizzbuzz (https://github.com/enterprisequalitycoding/fizzbuzzenterpris...). They love classes and imports and reinventing the wheel. A great way for me to tell trash AI slop code is it'll define a useful constant then 15 lines later do it again with a different name.

They love making code that looks impressive. "Wow look at all the classes and functions. It's so scalable. It's so dynamic. It validates every minutiae against multiple schema and solves a problem I never thought about." But it was trash code. One really was 400 lines and it didn't even look like it would work. Can't even imagine what it means for 4.5M moderately good human lines to become what? 27M fluffy filler repeat lines that don't even make sense?

manoDev 1 days ago [-]

The bad part of LLM is it got trained on bad examples because us humans also don't know WTF we're doing.

Neywiny 1 days ago [-]

Yeah maybe I need to do the old "you are a veteran engineer" nonsense. I've had some success telling it to implement everything it suggests and be production ready. I hate when it takes a shortcut and says I'll have to change it. That's kinda the whole point of me not writing the code...

e28eta 21 hours ago [-]

Unless I’m mistaken, it’s a monorepo. So it’s not 25M LoC in a single app, it’s (all?) of their server-side code and shared libraries. There’s also a variety of other languages in use.

16 years and thousands of engineers write a lot of code.

il-b 19 hours ago [-]

Imagine lots and lots of models and stubs generated from swagger, protobuf, sqlc etc.

dgrin91 1 days ago [-]

I don't understand why they felt the need to do a big-bang merge like this. It's a formatter, so the files should be functionally equivalent before and after. Why not just enable it for new files/edited files for a while, then once comfortable apply it to old files in batches? What advantage does the big bang merge give? Seems higher risk for the same reward

fsckboy 1 days ago [-]

it could introduce subtle bugs, so you do it all at once, while it's on the front of everybody's mind, with as much comprehensive review and testing of the parts or the whole as it takes to satisfy everybody. if you don't do it all at once, you'd need to repeat the same amount of testing multiple times.

>files should be functionally equivalent before and after

when you say something like this, the road you are on is paved with good intentions.

riffraff 21 hours ago [-]

You can also introduce subtle bugs in your own feature development, and if you change formatting there it's also at the front of your mind.

I think the main argument for doing a big bang rewrite is that you have a defined before/after, otherwise you're stuck in an endless in-between.

eigenblake 1 days ago [-]

Really reminds me that there's nothing in principle stopping us from storing parse trees and exposing them via something git like so we can avoid even needing to format, let alone also needing to resolve a whole category of merge conflicts based on that formatting. I mean a format is just a theme over your data -- I mean code.

nitwit005 1 days ago [-]

> Given that complexity, the hypothesis was simple: tackle the hardest syntax first and the rest will follow.

Always nice to see. I've seen people fall into the trap of designing for the common case, not realizing most of the code will be to deal with the less common cases.

sgc 24 hours ago [-]

In another field I have heard it called going for the jugular; the vivid description helps get the point across nicely.

If you want to master something, you will have to know the hardest part. So just deal with that first and then everything else is easy, because you are dealing with it as somebody who has already mastered the domain.

burnte 1 days ago [-]

The floating spiral thing is so distracting I spent more time deleting it in Inspector than reading the article. I feel like they hate their readers. Awful.

annaspies 1 days ago [-]

If you set `prefers-reduced-motion: reduce`, it goes away

comrade1234 1 days ago [-]

Man, must be nice to have the time to put so much work into tabs.

Pxtl 1 days ago [-]

Clean indenting is about saving time so you don't spend way too long getting lost trying to understand what seems like an insane piece of code until you realize it was a mundane bug hidden by incoherent indentation.

tmaly 1 days ago [-]

How did I know this was going to be a rewrite in Rust?

ryanisnan 23 hours ago [-]

Cool story. The treat at the end was fun as well, thank you!

hokkos 1 days ago [-]

Now it makes me wonder, are those 45M LoC untyped?

c3ab8ff137 1 days ago [-]

No, Stripe has its own Ruby typechecker - https://sorbet.org/

m12k 1 days ago [-]

https://brandur.org/nanoglyphs/015-ruby-typing#ruby-typing

hiroto_lemon 11 hours ago [-]

[dead]

failure_arch 1 days ago [-]

[dead]

exsol 1 days ago [-]

[dead]

andrewstuart 1 days ago [-]

[flagged]

mbStavola 1 days ago [-]

Considering that it's been doing so successfully at volume for just over 15 years, I think their language choice was fine.

sixo 1 days ago [-]

This ought to change your mind about Ruby!

skinfaxi 1 days ago [-]

Why is that terrifying?

mikedelago 1 days ago [-]

Some folks don't like shipping

fantasizr 1 days ago [-]

ive yet to see a compelling elitist programming language opinion. especially when used at big successful companies. these companies don't function in spite of their technology choices.

NetOpWibby 1 days ago [-]

The only one that worked on me wasn't even elitist in its framing.

Try TypeScript! It makes your JavaScript better!

That was enough for me.

lstodd 1 days ago [-]

> these companies don't function in spite of their technology choices.

shows you never worked at "big successful companies".

Jtsummers 1 days ago [-]

It's not particularly terrifying. Some people really just don't like Ruby.

sikozu 1 days ago [-]

The systems have to be written in some kind of programming language, and I think Ruby is a perfectly fine choice.

Imustaskforhelp 1 days ago [-]

Not denying that Ruby is a perfectly fine choice but within the article itself it says that Stripe runs the world's largest Ruby codebase so certainly it might be testing the constraints of the language.

The thing I am interested in is that I don't suppose Stripe always had this many LOC, so I would be curious to know whether, at any point as the codebase was growing, they looked at newer languages like golang or rust that might have been better suited to their work, and what their decisions/thinking process were to continue using ruby.

clintonb 1 days ago [-]

LOC doesn’t have much to do with the “constraints of the language”.

Stripe has dabbled in Golang. There is also a growing Java monorepo.

throwaway041207 1 days ago [-]

Stripe uses Sorbet which, in my experience, increases LOC.

sunrunner 1 days ago [-]

Things can always be worse. It could be PHP, for example.

burnte 1 days ago [-]

Facebook runs in it, so I think the language itself is probably a fine choice.

Twirrim 1 days ago [-]

It's almost like other factors than language choice are more important :)

msla 1 days ago [-]

If you think that's terrifying, imagine all of the essential code written in COBOL and FORTRAN.

Skippy the Intern, now retired these thirty years...

semiquaver 1 days ago [-]

I’d hardly call Sorbet Ruby :)

benbristow 1 days ago [-]

[dead]

CrzyLngPwd 1 days ago [-]

Surely, it no longer needs to be human-readable, and the era of write-only code is finally upon us with the dawn of AI writing our mealtickets.

Why bother formatting 25m lines of slop, and why is AI wasting tokens on making code look human-readable anyway?

sgc 24 hours ago [-]

Every LLM I have ever asked about this says they perform better when they receive pretty-printed code because it is easier to see structure and priorities. It has been an almost universal recommendation for me, and it makes sense since LLMs are just mimicking human expression.

throawayonthe 16 hours ago [-]

you asked the llm? i'm confused

you do understand it can't "know" how it performs right?

sgc 13 hours ago [-]

You actually think that LLMs are not fed docs on how they work in order to help users interact with them better? Asking an LLM how to use it is based on the reasonable presumption that the company making it will prioritize making it useful for users and work on programming it with its own best practices.

Again, it makes perfect sense as well based on how they are trained in the first place. Look at how they tokenize whitespace and you will see why it's useful. Each number of repeating white spaces gets a unique token (so 2 whitespaces = token1, 3 whitespaces = token2) - so it actually does make a very clear reinforcing hierarchy readily available. And we all know if there is anything an LLM needs, it is reinforcement of important points.

throwatdem12311 1 days ago [-]

What is even the point of formatting code anymore.

voidUpdate 19 hours ago [-]

To make it look nice and readable

throwatdem12311 15 hours ago [-]

For an agent?

voidUpdate 15 hours ago [-]

For you, the person reading and writing the code

throwatdem12311 3 hours ago [-]

People are still reading the code?

throawayonthe 16 hours ago [-]

clean diffs for one

stefantalpalaru 17 hours ago [-]

[dead]

cadamsdotcom 1 days ago [-]

An insight about code is that compared to the scale we operate on data, code as text is tiny. Instantaneous git operations and “run this tool over all the code” are the norm even while we wait for LLMs to stream their tokens back so tool calls can operate on them.

That insight might seem obvious - but if you stay cognizant of it as you work, you can invent some pretty amazing tooling for yourself & your team.
