pcouy

@pcouy@lemmy.pierre-couy.fr
15 Posts – 73 Comments
Joined 1 year ago

Downvoted for cropping out the reference to the original...

On this day, exactly 12 years ago (9:30 EDT, 1 Aug 2012), the most expensive software bug ever occurred, in terms of both dollars lost per second and total dollars lost. The company managed to pare its losses through the heroics of Goldman Sachs, and "only" lost $457 million (which still led to its dissolution).

Devs were tasked with porting their HFT bot to an upcoming NYSE API service that was announced to go live less than 33 days later. So they started a death-march sprint of 80-hour weeks. The HFT bot was written in C++. Because they didn't want to force anything to be recompiled, the lead architect decided to keep the exact same class and method signature for their PowerPeg::trade() method; Power Peg was the automated testing bot they had been using since 2003. This also meant that they did not have to update the WSDL for the clients that used the bot, either.

They ripped out the old dead code and put in the new code: code that actually called real logic instead of the test logic, which was designed, by default, to buy the highest offer given to it.

They tested it, they wrote unit tests, everything looked good. So they decided to deploy it at 8 AM EDT, 90 minutes before market open. QA testers tested it in prod and gave the all clear. Everyone was really happy. They'd done it. They'd made the tight deadline and deployed with just 90 minutes to spare...

They immediately went into a sprint standup and then a sprint retro meeting. Per their office policy, they left their phones (on mute) at their desks.

During the retro, the markets opened at 9:30 EDT, and the new bot went WILD!! It just started buying the highest offer for every stock on its buy list. The markets didn't react very abnormally, because it just looked like someone was bullish. But the bot was buying about $5 million worth of shares per second... Within 2 minutes, warning alarms were going off in their internal banking department: a huge percentage of their $2.5 billion in operating cash was being depleted, and fast!

Many people tried to contact the devs, but they were in a remote office in Hoboken due to the high price of real estate in Manhattan. Their phones were off and no one was at their computers.

The CEO was seen sending people running through the halls of the building, yelling, until the devs finally noticed. 11 minutes had gone by, and the bots had bought over $3 billion of stock. The total cash reserves were depleted. The company was in SERIOUS trouble...

None of the devs could find the source of the bug. The CEO, desperate, asked for solutions. "KILL THE SERVERS!!" one of the devs shouted.

They got techs at the datacenter next to the NYSE building to find all 8 servers that ran the bots and DESTROY them with fire axes, just ripping the wires out... And finally, after 37 minutes, the bots stopped trading. Total paper loss: $10.8 billion.

The SEC and NYSE refused to rewind the trades for all but 6 stocks, so the on-paper losses still stood at around $8 billion. There was no way they could pay. Goldman Sachs stepped in and offered to buy the entire position, at a for-profit price that left the firm with a $457 million loss, which they accepted. All in all, the company lost close to $500 million, all of its corporate clients left, and it went out of business a few weeks later.

Now, what was the cause of the bug? Fat-fingered human error during the release.

The sysop had declined to implement CI/CD, which was still in its infancy, probably because manual releases were his full-time job and he was making something like $300,000 in 2012 dollars ($500k today). There were 8 servers that housed the bot, along with a few clients on the same servers.

The sysop had correctly typed out and pasted the correct rsync commands to get the new C++ binary onto the servers, except for server 5 of 8: the 5th command had an extra 5 in the server name. That rsync failed, but because he had pasted all of the commands at once, he didn't notice...
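
This is exactly the kind of failure a per-host verification step would have caught. Here is a hypothetical Python sketch of such a check (the hostnames, paths, and checksum step are all invented; the real deployment was manual):

    import hashlib
    import subprocess

    SERVERS = [f"trade{n:02d}" for n in range(1, 9)]  # 8 invented hostnames
    BINARY = "hft_bot"

    def deploy(local_path: str) -> None:
        # Checksum the binary we are about to ship...
        with open(local_path, "rb") as f:
            expected = hashlib.sha256(f.read()).hexdigest()
        for host in SERVERS:
            # A typo'd hostname makes rsync fail, but only checking the
            # exit status (check=True) turns that into a loud error instead
            # of one lost line in a wall of pasted commands.
            subprocess.run(
                ["rsync", "-a", local_path, f"{host}:/opt/bot/{BINARY}"],
                check=True,
            )
            # ...then confirm every server actually has the bits we built.
            result = subprocess.run(
                ["ssh", host, f"sha256sum /opt/bot/{BINARY}"],
                capture_output=True, text=True, check=True,
            )
            if result.stdout.split()[0] != expected:
                raise RuntimeError(f"{host} has a stale binary!")

    deploy("./build/hft_bot")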

Because the new code used the exact same signature for the trade() method, server 5 was happy to keep buying up the most expensive offer it was given: it was still running the sad-path test trading software. If they had changed the method signature, the stale code wouldn't have run and the bug wouldn't have happened.
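
To make that failure mode concrete, here is a toy Python sketch (the real system was C++, and every name here is invented): when old and new logic share an identical signature, a server that never received the update keeps responding normally to every caller, just with the worst possible behavior.

    class PowerPegTrader:              # old test harness, unchanged since 2003
        def trade(self, symbol: str, offers: list[float]) -> float:
            return max(offers)         # sad path: always lift the highest offer

    class ProductionTrader:            # the new, real trading logic
        def trade(self, symbol: str, offers: list[float]) -> float:
            return min(offers)         # buy at the best (lowest) offered price

    def route_order(trader, symbol: str, offers: list[float]) -> float:
        # The caller only knows the shared signature, so a stale server
        # "works" fine; nothing crashes, nothing logs an error.
        return trader.trade(symbol, offers)

    print(route_order(ProductionTrader(), "XYZ", [10.0, 10.5, 11.2]))  # 10.0
    print(route_order(PowerPegTrader(), "XYZ", [10.0, 10.5, 11.2]))    # 11.2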

At 9:43 EDT, the devs collectively decided to do a rollback to the previous release. This was the worst possible mistake, because it put the Power Peg dead code back onto the other 7 servers, compounding the problem. It then took about 3 minutes for anyone in Finance to actually inform them. At that point, nearly $5 million per second was being lost to the bug.

It wasn't until 9:58 EDT, when the servers had all been destroyed, that the trading stopped.

Here is a description of the aftermath:

It was not until 9:58 a.m. that Knight engineers identified the root cause and shut down SMARS on all the servers; however, the damage had been done. Knight had executed over 4 million trades in 154 stocks totaling more than 397 million shares; it assumed a net long position in 80 stocks of approximately $3.5 billion as well as a net short position in 74 stocks of approximately $3.15 billion.

28 minutes. $6.65 billion in positions inappropriately taken on. ~1680 seconds. ~$3.96 million/second.

But after the rollback at 9:43, about $4.4 billion of that was lost in ~900 seconds: ~$4.9 million/second.
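
Sanity-checking those rates against the position sizes quoted above:

$$\frac{\$3.5\,\text{B} + \$3.15\,\text{B}}{1680\ \text{s}} \approx \$3.96\ \text{million/s}, \qquad \frac{\$4.4\,\text{B}}{900\ \text{s}} \approx \$4.9\ \text{million/s}$$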

That was the story of how a bad software decision and a fat-fingered manual production release destroyed the most profitable stock trading firm of its time, through the most expensive software bug in human history.

For anyone who wonders, this is related to cryptocurrencies

Blocking the DNS was the first thing I did. This is intended to restore the map feature without having to trust a random company I've never heard of.

What do you mean by "a diff of a code fix" that would be simpler?

When I mentioned that "I can confirm it is not realistic to self-host a tile provider", it's because I tried to run maptiler: it maxed out my CPU for 2 hours before my disk got filled while trying to generate the tiles from OSM data (and that was just for France).

Edit: Anyway, I don't think this should be in Immich's scope. Simply providing an easy option to switch tile providers would let people motivated enough to host maptiler use it.

Edit 2: More details on how hard it is to host your own tile provider are available on the official OSM wiki

What's up with all the shilling posts lately?

This has existed since at least 2018 according to their Twitter, and is related to crypto currencies through its Radworks DAO

Edit: I'm not saying OP themselves is a shill. Radicle did a pretty good job of hiding its cryptocurrency ties. They even renamed their token from Radicle to Radworks a few years ago. It seems like cryptobros are adapting to the fact that being related to cryptocurrencies hinders adoption among technical people.

In my experience, OnlyOffice has the best compatibility with M$ Office. You should try it if you haven't

Quoting one dev from the conversation I had on Discord:

the one run by OSM is not intended for general purpose use because that results in way too much load on their system. We used to use theirs, but as Immich grew we decided that we should relieve them of that

I guess you (and they) are talking about raster tiles, since OSM does not seem to provide vector tiles

I think they do get marked as dead once the Bodis subdomain stops acting as a Lemmy instance. But I was wondering if a large number of instances "waking up from the dead" and acting maliciously could cause some trouble. Or would such "undead" instances pose no more threat to the fediverse than the same number of newly created malicious instances? I'm mainly thinking about stuff like being in a privileged position to DoS most instances at once, or impersonating accounts that used to actually exist on these "undead" instances.

At this point, I'll just assume you are trolling and stop replying after this comment.

This post is trying to provide a generic solution to the fact that there is no reasonable way to get map tiles without relying on a third-party provider.

I additionally included instructions on how to set it up with Immich, but I don't see how a caching proxy in front of OSM should be part of Immich, a piece of software focused on managing photo libraries.

You can, but you would not be able to display the map. Might as well disable the map server-wide

How does an nginx config fit as a "diff" when the Immich repo and docker images do not include nginx (or any other reverse proxy)?

Thanks for the detailed feedback. According to one Immich dev, they used to use OSM's raster tile provider but switched away from it since they were causing too much load on OSM's servers.

There does not seem to be any non-commercial vector-tile provider at the moment (though OSM seems to be working on one), and it seems like overkill to try to self-host a tile provider (at least with the default level of detail). Maybe the way forward is to find a balanced level of detail that makes self-hosting reasonable.

Not yet, but I will probably submit a PR to include this guide in the docs

I don't game that much on pc anymore, but this reminded me of this post about Linux gamers providing good bug reports.

Your sensitive data and logins are tied to email addresses, which are tied to domains. Lose your domain, someone can access everything.

I recently stumbled upon an article showing how bad this can be when the expired domains were used for important/serious stuff

No need to be rude...

never stopped POSTing, even though I configured nginx to always respond 403 to anything from them for about a year now.

Lol, there are definitely some stubborn user agents out there. I've been serving 418 to a bunch of SEO crawlers for a few months now, with fail2ban configured to drop all packets from their IPs/CIDR ranges after a few attempts. They keep coming back at the same rate as soon as they get unbanned. I guess they keep sending requests into the void for the whole ban duration.

Using 418 for undesirable requests instead of a more common status code (such as 403) lets me easily filter these blocks in fail2ban, which can help weed out a lot of noise in server logs.
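
For illustration, here's a minimal Python sketch of the same filtering idea, assuming nginx's default "combined" log format (fail2ban does this job with a regex in a filter file; this is not its actual code):

    import re
    from collections import Counter

    # client IP, ident, user, [timestamp], "request", status code
    LINE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) ')

    hits = Counter()
    with open("/var/log/nginx/access.log") as log:
        for line in log:
            m = LINE.match(line)
            if m and m.group("status") == "418":
                hits[m.group("ip")] += 1

    # The IPs piling up 418s here are the ones fail2ban would ban.
    for ip, count in hits.most_common(10):
        print(ip, count)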

Well, watts are just a different way to write joules per second. The unit we should eliminate is the {k,M}Wh, which introduces a factor of 3.6 in conversions to/from the regular unit system.
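
The 3.6 comes straight from the definitions:

$$1\ \text{kWh} = 10^{3}\ \text{W} \times 3600\ \text{s} = 3.6 \times 10^{6}\ \text{J} = 3.6\ \text{MJ}$$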

With all the botting going on on Reddit, this whole Google AI deal makes me think of the recent paper demonstrating that, as common sense would suggest, deep learning models collapse when successive generations are trained on the previous generations' output.

Yeah, there is something oddly mesmerizing about projects that solve an "already-solved-in-a-more-efficient-way" problem in a weird way

The readme mentions "transcription time on CPU" so it's probably running locally

Things have been going well for me, using docker-mailserver.

I followed the setup guide and did everything in the DKIM, DMARC and SPF documentation page. The initial setup required more involvement from me than your standard docker-compose self-hosting deployment, but I have had no issues at all since then (for now, fingers crossed): I have never missed any inbound e-mail, and my outbound e-mails have not been rejected by any spam filter yet.

However, I agree with everyone else that you should not self-host an important contact address without proper redundancy/recovery mechanism in case anything goes wrong.

You should also understand that self-hosting an e-mail address means you can never let your domain expire: someone could register your expired domain and receive e-mails sent to you. This means you should probably not use a self-hosted e-mail address to register accounts on services that may outlive your self-hosted setup, because e-mail is frequently used to send password-reset links.

Github is not really independent from Git; it's a Git provider (you could say Github is to Git what Gmail is to e-mail).

I personally blame apple for a lot of the tech illiteracy in the younger generations.

Nice paywall :/

Can you give examples of countries where mainstream media is not owned by billionaires ?

Not so long ago, while trying to turn a side project of mine into a package, I bind mounted my home directory into a chroot.

Guess what happened when I rm -rfed the chroot...

2 years ago was already amazing for someone who tried to play CS 1.6 and trackmania using wine 18 years ago

They told me about hosting their own tile server earlier today. I'm really impressed by how fast they moved!

A pull request for a privacy page during the onboarding is in the works, and I've been working with them to update the settings page and documentation (with the goal of providing an easy way to switch map providers). They are also working on a privacy policy, and want to ship all of this in a few weeks as part of a single release.

Once again, I'm really impressed with how well they're handling this

Are you suggesting something like continuous timezones? Thanks for bringing this nightmare to a whole new level! :)

Thanks for sharing your experience and for the links.

Do you think it would be doable to make/host a tile server that only generates the first few zoom levels for the whole planet by default, and is able to generate tiles at more detailed zoom levels only for specific locations? I'm thinking of a feature where Immich asks the tile server to generate the appropriate tiles based on the locations of its photos. Since we only ever zoom in on locations where photos have been taken, and we often take several photos at the same location, could this decrease the requirements enough to make self-hosting viable?
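
For a rough sense of scale, here is a back-of-the-envelope Python sketch using the standard slippy-map tile scheme (nothing Immich-specific, just the usual Web Mercator math): tile counts grow as 4^z, so pre-rendering the whole planet is cheap at low zooms and only explodes at the detailed levels you would render on demand.

    import math

    def tiles_for_planet(zoom: int) -> int:
        # The XYZ scheme has 2^z by 2^z tiles at zoom level z.
        return 4 ** zoom

    def tile_for_location(lat: float, lon: float, zoom: int) -> tuple[int, int]:
        # Which tile contains a given GPS point (standard slippy-map formula).
        n = 2 ** zoom
        x = int((lon + 180.0) / 360.0 * n)
        y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
        return x, y

    for zoom in (6, 10, 14, 18):
        print(zoom, tiles_for_planet(zoom))  # 4096, ~1e6, ~2.7e8, ~6.9e10

    # A photo geotagged in Paris would only require tiles around here:
    print(tile_for_location(48.8566, 2.3522, 14))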

Wow! I did not know about htop's built in support. That's awesome to know about, thanks!

I used to wonder what kind of nerd notices this kind of thing, now I'm one of them

Edit: If you want to join us:

  • You can run Pi-hole, a self-hosted DNS server that allows monitoring/blocking DNS requests from devices configured to use it. In its default configuration, it acts as a network-wide ad/tracker blocker.
  • On Android, you can install Rethink DNS. This will configure itself as a VPN on your device, forcing all traffic to go through it. This allows it to act as an on-device firewall that allows monitoring/blocking DNS requests and TCP/UDP connections. This is similar to Pi-hole's feature set, but being on-device allows it to be app-aware: the logs detail which app is responsible for which connection, and the allow/block rules can be app-dependent. The app honestly goes beyond all my expectations:
    • it does a good job of being easy to use by default
    • it is very configurable, which gives you a lot of control if you want/need/can handle it
    • you can configure it to route traffic (after applying firewall rules) through a Wireguard VPN or through Orbot (apps that act as VPNs are not compatible with each other: you can only have one active at a time)
    • you can even configure several Wireguard interfaces at the same time, and route specific apps through specific tunnels

It does not seem to be the case. Was it the full domain for this instance?

I'll probably look into newer, fancier options such as Caddy one day, but as far as I remember Nginx has never failed me: it's stable, battle-tested, and extremely mature. I can't remember a single time I've been affected by a breaking change (I could not even find one by searching the changelogs), and the feature set makes it very versatile. Newer alternatives seem really interesting, but they appear to have quite frequent breaking changes and are not as feature-rich.

That being said, I'd love to see a side-by-side comparison of Nginx and Caddy configs (if anyone wants to translate the Nginx caching proxy for OSM I shared earlier this week into Caddy, that would make a good and useful example), as well as examples of features missing from Nginx. That might give me enough motivation to actually try Caddy :)

(edit: ad->and)

The closing parenthesis got caught in the link (at least with my client), turning it into a 404. You should add a space.

I don't know if you're referring to me, but I've previously discussed this idea several times in similar posts' comments.

I think we could implement it as a separate server software that generically allows aggregation of ActivityPub feeds under separate ActivityPub feeds.

I don't use Traefik myself, but this documentation page seems to suggest that Traefik only supports in-memory caching (which would eat RAM and not persist across reboots). You can probably run Nginx with this config inside a container to do the caching, then have Traefik route requests for immich.your-domain.tld/map_proxy/* to the caching proxy container.

Each time you send a packet over the internet, several routers handle this packet without touching the source and destination IP addresses.

There is nothing stopping him from configuring the VPS to forward packets to the home server, rewriting the destination IP (and optionally the destination port as well) but leaving the source IP intact.

For outgoing packets, the VPS should rewrite the source (home server) IP and port and leave the destination intact.

With iptables, this is done with DNAT rules for the inbound rewrite and MASQUERADE (or SNAT) rules for the outbound one.

This is pretty much how any NAT, including the one in your home router, works.

You then configure the home server to use the VPS as its gateway over Wireguard, which should achieve the desired result.
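
To make the flow concrete, here is a toy Python model of the address rewriting (purely illustrative; real NAT happens in the kernel via conntrack, and all addresses are made up):

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Packet:
        src: str  # "ip:port"
        dst: str

    VPS_PUBLIC = "203.0.113.7:443"  # VPS public address (documentation range)
    HOME_SERVER = "10.8.0.2:8443"   # home server, reached over Wireguard

    def dnat_inbound(p: Packet) -> Packet:
        # Client -> VPS: rewrite the destination to the home server,
        # leaving the client's source IP intact.
        return replace(p, dst=HOME_SERVER) if p.dst == VPS_PUBLIC else p

    def snat_outbound(p: Packet) -> Packet:
        # Home server -> client: rewrite the source so the reply
        # appears to come from the VPS.
        return replace(p, src=VPS_PUBLIC) if p.src == HOME_SERVER else p

    inbound = dnat_inbound(Packet(src="198.51.100.20:50000", dst=VPS_PUBLIC))
    reply = snat_outbound(Packet(src=HOME_SERVER, dst=inbound.src))
    print(inbound)  # source unchanged, destination rewritten
    print(reply)    # source rewritten back to the VPS address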