Corbin

@Corbin@programming.dev
2 Posts – 69 Comments
Joined 1 year ago

Because frankly, Ronald (the current maintainer, not the original author) is very competent. I say this as somebody who has personally been yelled at by Ronald at a kernel summit; I didn't deserve it, but none of his technical points were wrong. I like to think of myself as the kind of person that, given enough time and documentation, can maintain anything; I think it'd still take three of me to do Ronald's job. (Well, "job." I think he technically works for Red Hat or something?) Not to excuse his conduct, just to explain why he's not been replaced yet.

Most consumer-grade NICs have a default MAC address which is retrievable with device drivers, but delegate (Ethernet) packet assembly to the OS. If the OS asks the NIC to emit a packet, then the NIC often receives the packet as a blob, DMA'd from main memory, and emits the bytes as octets. Other NICs do manage packet assembly, but allow overwriting the default MAC address. By the time I was learning Linux, we had GNU MAC Changer available in userland with the macchanger command, and many distros have configuration for randomizing or hardcoding MAC addresses upon boot.

I want to say that this is all because olden corporate network management policies could require a technician to replace a NIC without changing the MAC address, but more likely it is because framing and packet assembly were not traditionally handed to a second controller, and were instead bit-banged or MMIO'd by the CPU.
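To make "the OS assembles the packet" concrete, here is a minimal sketch of building an Ethernet II frame as a plain blob of bytes. The helper names (`parse_mac`, `build_frame`) and the addresses are made up for illustration. The point is only that the source MAC is just six octets the software chooses to write; nothing here talks to a real NIC or driver.

```python
import struct

def parse_mac(text: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw octets."""
    octets = bytes(int(part, 16) for part in text.split(":"))
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly six octets")
    return octets

def build_frame(dst: str, src: str, ethertype: int, payload: bytes) -> bytes:
    """Assemble an Ethernet II frame without the trailing checksum."""
    # Header: destination MAC, source MAC, 16-bit EtherType, network byte order.
    header = struct.pack("!6s6sH", parse_mac(dst), parse_mac(src), ethertype)
    # Pad the payload to the 46-octet minimum. The frame check sequence is
    # assumed to be appended by the hardware, so it is omitted here.
    return header + payload.ljust(46, b"\x00")

# Hypothetical example: broadcast frame with a locally administered source MAC.
frame = build_frame("ff:ff:ff:ff:ff:ff", "02:00:00:00:00:01", 0x0800, b"hello")
```

Whatever goes into `src` is what ends up on the wire in this model, which is why tools like macchanger can exist at all.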

It's because most of the hard questions and theorems can be phrased over the Booleans. Lawvere's fixed-point theorem, which has Turing's theorem and Rice's theorem as special cases (see also Yanofsky 2003), applies to Boolean values just as well as to natural numbers.

That said, you're right to feel like there should be more explanation. Far too many parser papers are written with only recognition in mind, and the actual execution of parsing rules can be a serious engineering challenge. It would be nice if parser formalisms were described in terms of AST constructors and not mere verifiers.

Hi! Please don't link anything from this subdomain again. It was considered a plague back on Reddit, and this sort of content-free post shouldn't be encouraged here either.

[HTML and Markdown] are not grammatically Type 2 (Chomsky-wise, Context-Free); rather, they are Type 3 (Chomsky-wise, Regular).

This is at least half-wrong, in that HTML is clearly not regular. The proof is simple: HTML structurally embeds a language of balanced parentheses (a Dyck language), and such languages are context-free and not regular. I don't know Markdown well and there are several flavors; it's quite possible that some flavors are regular. However, traditional Markdown embeds HTML, so if HTML is not regular then neither is Markdown.
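A minimal sketch of the Dyck-language point, using a toy tag language rather than real HTML (no attributes, void elements, or optional closing tags; the name `balanced` is mine). A regular expression is enough to find the tags, but checking that they nest correctly needs an unbounded stack, which is exactly what a finite automaton lacks.

```python
import re

# Tokenizing is regular: one regex finds every opening or closing tag.
TAG = re.compile(r"<(/?)([a-z]+)>")

def balanced(text: str) -> bool:
    """Check that every <tag> is closed by a matching </tag>, properly nested."""
    stack = []
    for closing, name in TAG.findall(text):
        if not closing:
            stack.append(name)          # remember the open tag
        elif not stack or stack.pop() != name:
            return False                # close without a matching open
    return not stack                    # leftover opens are also an error
```

The stack depth grows with the nesting depth of the input, so no fixed amount of state is enough for all inputs.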

I once did a syntax-directed translation of Markdown to HTML in AWK!

Sure. The original Markdown implementation was in Perl and operated similarly. However, note that this doesn't imply that either language is regular, only that a translation is possible assuming the input is valid Markdown. Punting on recognition means that invalid parse trees have undefined translations, presumably at least sometimes generating invalid HTML.

Don't use OpenAI's outdated tools. Also, don't rely on prompt engineering to force the output to conform. Instead, use a local LLM and something like jsonformer or parserllm which can provably output well-formed/parseable text.

Nailed it. I think about this a lot: a sysadmin is basically a manager of dozens, hundreds, or even thousands of computers. But management is a poor way of orchestrating human labor; small teams usually operate better without management. So, is there a better way to administer computer systems as well?

As a society and as individual computer scientists, none of us actually know what a computer is or how to use them. All programming languages are guesses, mere attempts to encode our natural-language reasoning and philosophy in the purely syntactic and formal fashion required by computers. Don't let yourself become biased in favor of specific languages; instead, understand that all languages are bad in different ways.

This shit is why I cannot recommend Truffle/Graal. Yes, it's cool technology. Yes, it works well. Yes, I remember Chris Seaton. Yes, most of it is Free Software. However, Oracle is still the fucking lawnmower, and it's not safe to build upon anything they can convince a judge they might own.

Alternatives include RPython (my preference) and also GNU Lightning.

I've only skimmed the paper, so let me know if I've missed something, ideally with a page number. Also, it's late and I'm tired, so I'm not hyperlinking anything; sorry.

I'm not sure what a "full semantic analysis" entails, but always keep Rice's theorem in mind: there aren't any interesting semantic analyses available for Turing-complete systems.

Python is a descendant of Smalltalk. Like several of its cousins, particularly the famous ECMAScript, Python doesn't have types or classes in the Smalltalk sense, but prototypes which form a class-like hierarchy. From the static-analysis point of view, whether a type is created or instantiated is a matter of Rice's theorem.

The ability to invoke type() at runtime is not lazy. Python is eager and strict; even generators are eager and strict, although they can cause stack frames to become "stale"; whether a stale stack frame is cleaned up is also a matter of Rice's theorem.

None of this prevents compilation of Python. The RPython toolchain first imports an application, evaluating all calls to type() and pre-building all classes; then, it statically analyzes all of the Python objects in memory and decompiles their bytecode to determine their behaviors. The resulting executable behaves as if it were started from a snapshot of the Python heap.

Yes, CPython sucks. Use PyPy instead; also, use cffi to wrap C libraries.

Walter Bright has fairly odious political opinions; like many social conservatives these days, he likes to complain about wokeness and communism, and I would completely understand a community fork simply to remove his control over various parts of the D language.

Also, just for a quick sanity-check: Which languages have you invested/migrated to, only to find that "political stunts" had a "negative impact" on your planned development?

Mattermost is the most obvious option; it's a clone of Slack. IRC is another good option, although I know a lot of people hate it because they prefer features to freedom. I cannot recommend Matrix; the UX is fine but the cryptography has a few issues, as documented by Soatok here.

Object-oriented design is about message-passing; messages are more important than objects. Classes are completely irrelevant -- there's an entire branch of object-oriented language design without classes!
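A rough sketch of what "messages without classes" can look like, written in Python only for familiarity. The object below is a closure that receives a message name and dispatches on it. `make_counter` and its message names are hypothetical, and this is one illustration of the idea, not how any particular classless language implements it.

```python
def make_counter(start=0):
    """Build an object as a closure: private state plus a message handler."""
    state = {"count": start}

    def receive(message, *args):
        # The object is defined entirely by which messages it answers.
        if message == "increment":
            state["count"] += args[0] if args else 1
            return receive              # return self so sends can be chained
        if message == "value":
            return state["count"]
        raise ValueError(f"message not understood: {message}")

    return receive

counter = make_counter()
counter("increment")("increment", 5)
total = counter("value")
```

No class appears anywhere, yet callers can only interact with the counter by sending it messages.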

C'mon, I think you have better reading comprehension than that. He's a professional data scientist specializing in machine learning. He went to grad school, then to big industry, then to startups, and is currently running a consultancy. He is very clearly not "on the side of the road." He's merely telling executives to fuck off with their AI grift.

python3Packages.scikit-image appears to be available and non-broken in nixpkgs; on my machine, I get /nix/store/w8681ncsw92cn4gq6gyraw4z19r0r6c3-python3.11-scikit-image-0.21.0. Do you have an actual example?

I understand your point, but given nixpkgs' position in the community, it might be a moot point.

There are subfields of computer science dedicated to this question. A good starting point for the theory would be Pessimal algorithms and simplexity analysis, which lays out two concepts:

  • The time & space simplexity of an algorithm indicates best-case lower bounds on resource usage, and
  • An algorithm is pessimal if no equivalent algorithm wastes more time/space/etc.

For example, common folklore is that sorting has O(n lg n) time complexity, depending on assumptions. In the paper, they give that sorting has Ω(n ** (lg n / 2)) time simplexity; any algorithm which takes more time, like bogosort, must do so through some sort of trickery like non-determinism or wasting time via do-nothing operations.
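For a feel of what a deliberately slow but honest algorithm looks like, here is a sketch of slowsort, the "multiply and surrender" sort associated with that paper. I'm transcribing it from memory, so treat the details as my assumption rather than the paper's exact text. It never idles or rolls dice; every step makes real progress, just as little of it as possible.

```python
def slowsort(items, lo=0, hi=None):
    """Sort items[lo..hi] in place, as slowly as honest recursion allows."""
    if hi is None:
        hi = len(items) - 1
    if lo >= hi:
        return
    mid = (lo + hi) // 2
    slowsort(items, lo, mid)        # maximum of the first half ends up at mid
    slowsort(items, mid + 1, hi)    # maximum of the second half ends up at hi
    if items[hi] < items[mid]:
        # Move the overall maximum to the end of the range.
        items[mid], items[hi] = items[hi], items[mid]
    slowsort(items, lo, hi - 1)     # then sort everything except the maximum

data = [5, 2, 9, 1, 5, 6]
slowsort(data)
```

Keep the inputs tiny; the running time grows faster than any polynomial.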

Oracle Ruined America's Cup (Larry Ellison)

You are very close to a deep truth of language design: all large popular languages slowly die from committees overloading them with cruft and redundant features, and at the end of their healthspans, they become painful for all involved. In your list, this includes both PHP and ECMAScript; Perl 5 avoided this fate, but Raku certainly suffers from it.

Free Software is literally communist.

You're cheering for exploitation of a commons.

Yeah, this list of sites is making me think of asking for a book by loudly asking a library, a series of coffeeshops, a chud microbrewery, and an 11-year-old bully. Try quietly reading in the library first, I guess.

You are correct. For the example of regular languages, we have Kleene algebras, which are special cases of *-semirings. Similar algebras exist for the rest of the Chomsky hierarchy.

Before going up the hierarchy, I would recommend checking out what we can do with semirings alone. Two great papers on the topic are "Fun with Semirings", Dolan 2013 and "A Very General Method of Computing Shortest Paths", O'Connor 2011. Don't be fooled by the titles; they both involve surprise guest appearances from regular expressions.
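As a taste of the general idea, here is a minimal sketch of a matrix closure parameterized by a *-semiring. The names (`StarSemiring`, `closure`, `TROPICAL`, `BOOLEAN`) are mine and the code is a simplified illustration, not a transcription of either paper. Swapping the semiring turns the same loop into shortest paths or reachability.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class StarSemiring:
    zero: Any            # identity for add
    one: Any             # identity for mul
    add: Callable        # choose between / combine alternative paths
    mul: Callable        # extend a path by another path
    star: Callable       # sum over all repetitions of a loop

def closure(s, matrix):
    """Return A* = I + A + A*A + ... for a square matrix over the semiring s."""
    n = len(matrix)
    m = [row[:] for row in matrix]
    for k in range(n):
        loop = s.star(m[k][k])
        # Allow paths through node k, looping at k any number of times.
        m = [[s.add(m[i][j], s.mul(m[i][k], s.mul(loop, m[k][j])))
              for j in range(n)] for i in range(n)]
    # Add the identity matrix to account for empty paths.
    return [[s.add(m[i][j], s.one if i == j else s.zero)
             for j in range(n)] for i in range(n)]

INF = float("inf")
# Shortest paths: assumes non-negative weights, so looping never helps.
TROPICAL = StarSemiring(INF, 0.0, min, lambda a, b: a + b, lambda a: 0.0)
# Reachability over the Booleans.
BOOLEAN = StarSemiring(False, True, lambda a, b: a or b,
                       lambda a, b: a and b, lambda a: True)

distances = closure(TROPICAL, [[INF, 4.0, 7.0], [INF, INF, 1.0], [INF, INF, INF]])
reachable = closure(BOOLEAN, [[False, True, False], [False, False, True], [False, False, False]])
```

With a semiring whose elements are regular expressions, the same closure computes a regex for the paths between each pair of states, which is where the guest appearances come from.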

They have two. If the complaint is that neither wiki is as rich as the Gentoo or Arch wiki, consider that perhaps NixOS users don't need as much supplementary advice for configuring their systems.

Lucky 10000: It's a pun. A quaver is a duration of a musical note in the UK, equivalent to a USA eighth note; a semidemihemiquaver is a sixty-fourth note, used to notate e.g. certain kinds of trumpet trills.

Learn finance and bookkeeping; work for a bank. Software development is not lucrative; the high-paying jobs are fundamentally tough and cause burnout. Median tenure at big software companies is maybe 2-3 years, and it will ruin your ability to relate to other humans.

I think they're saying that e.g. you shouldn't index a natural key unless you know that you're going to search/collate by that key as a column. Telling the database that a certain column contains (a component of) the primary key is adding a restriction to that column.

PyPy exists. (This is news to around 95% of the community.)

You tried to apply far too much pressure over too large a surface area. Either make a more focused approach by not chasing Free Software and XMPP supremacy at the same time, or find ambient ways to give people options without forcing them to make choices in the direction you want. In particular, complaining about bridges usually doesn't get the discussion to a useful place; instead, try showing people on the other side of the bridge how wonderful your experience is.

Also, I get that you might not personally like IRC, but you need to understand its place in high-reliability distributed systems before trying to replace it; the majority of them use IRC instead of XMPP for their disaster recovery precisely because its protocol jankiness makes it easier to wield in certain disaster situations.

You'll have to trust me from when I worked at Google at the same time as her. In particular, you'll have to trust me that she called via public petition for an end to democracy and for Eric Schmidt to be installed as CEO of the USA and granted dictatorial powers; this is a flavor of fascism known as corporatism or corporate fascism. (Alternatively, you might trust the Internet Archive's copy of the Guardian's story from that time. The original petition isn't up anymore.)

::: spoiler dog whistles
"Cosmopolitan" is a reference to a common component of anti-Jewish conspiracy theories and means the same thing as "globalist". "Ape" is one of many common slurs for African-Americans and Africans. "Llamaphile" is part of a common "-phile" pattern of anti-furry slurs.
:::

You're not crazy or harsh. This is a very real problem. I have been stuck doing business development at every company I've worked for. There's always some shitty load-bearing Django app whose schema determines what the business is capable of doing, and somebody's gotta maintain it. It's gotten to the point where I assume that any interesting things I do will be outside of work and not for pay.

If it's on Stack Exchange, you can help us keep the community decent by assuming good faith and being patient with newcomers. Yes, it's frustrating. And yeah, sometimes, it's basically impossible to avoid sarcasm and scorn, just like how HN sometimes needs to be sneered at, but we can still strive for a maximum of civility.

If all else fails, just remember: you're not on Philosophy SE or any of the religious communities, it's just a computer question, and it can be answered without devolving into an opinion war. Pat yourself on the back for being a "schmott guy!" and write a polite answer that hopefully the newbies will grok. Be respectful of plural perspectives; it's a feature that a question may have multiple well-liked answers.

Well put. And this is a generic pattern; for example, GPUs are only faster than CPUs if preparing the GPU, running the computation there, and retrieving the result together take less time than directly evaluating the algorithm on the CPU. This also applies to main memory! Anything outside of the CPU can incur a latency/throughput/scaling tradeoff.
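The break-even arithmetic can be sketched as a tiny cost model. Every number and name below is a made-up assumption for illustration (a symmetric 10 GB/s link, a fixed launch latency); real devices need measurement, not a formula.

```python
def offload_pays_off(cpu_s, device_s, payload_bytes, link_bytes_per_s, launch_s):
    """Return True when shipping the work to the device beats staying on the CPU."""
    # Copy the input over and the result back; assume a symmetric link.
    transfer_s = 2 * payload_bytes / link_bytes_per_s
    return launch_s + transfer_s + device_s < cpu_s

# Hypothetical small job: the transfer alone costs more than the CPU run.
small_job = offload_pays_off(cpu_s=0.001, device_s=0.0001,
                             payload_bytes=10_000_000,
                             link_bytes_per_s=10e9, launch_s=0.00001)

# Hypothetical large job: same transfer, but far more compute per byte.
large_job = offload_pays_off(cpu_s=1.0, device_s=0.05,
                             payload_bytes=10_000_000,
                             link_bytes_per_s=10e9, launch_s=0.00001)
```

The same shape of inequality applies to caches versus main memory, or local disk versus a network service.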

You may be pleased to know that PyPy's Python 2.7 branch will be maintained indefinitely, since PyPy is also written in Python 2.7. Also, if you can't leave CPython yet, ActivePython's team is publishing CPython 2.7 security patches.

Extension modules are implemented in C because the interpreter is written in C. If it were written in another language, folks would write extension modules for that language instead. Also, it would be less relevant if people used C bindings like cffi, which are portable to PyPy and other interpreters… but they don't.

It looks alright. You'll have to use it for a few months before knowing whether it's comfortable.

To be honest, I'm not a fan of variables; I'm in the tacit/concatenative camp. But I think it's good to try new things and learn for yourself why they are good or bad.

Yeah, some folks have trouble with pointers, and computer-engineering curricula are designed to discourage folks from taking third-year courses if pointers don't make sense. It's a stereotype for a reason. I'd love to know if there's an underlying psychological explanation, or if pointers are just...hard.

Thanks for offering your perspective! It's important that we keep in mind that not everybody who studies computer science becomes a professional programmer, and you've offered us good food for thought.

For what it's worth, pointers are fundamental for Von Neumann machines, which are very common in the computing world; your current machine and the machine serving this page are both Von Neumann. In such machines, memory doesn't just store data, but also instructions; the machine has an instruction pointer, which is a pointer referencing the currently-executing instruction in memory. So, if one wants to understand how a computer jumps from one instruction to another, then one must somewhat understand pointers.
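Here is a toy sketch of that idea: one flat memory holding both instructions and data, plus an instruction pointer. The instruction set (`LOAD`, `ADD`, `STORE`, `JNZ`, `HALT`) is invented for this example and doesn't correspond to any real machine. Notice that a jump is nothing more than assigning a new address to the instruction pointer.

```python
def run(memory):
    """Execute a program stored in the same list as its data."""
    ip = 0    # instruction pointer: address of the next instruction
    acc = 0   # single accumulator register
    while True:
        op, arg = memory[ip], memory[ip + 1]
        ip += 2                      # by default, fall through to the next one
        if op == "LOAD":
            acc = memory[arg]        # arg is a pointer to a data cell
        elif op == "ADD":
            acc += memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "JNZ":
            if acc != 0:
                ip = arg             # a jump just overwrites the pointer
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown instruction at {ip - 2}: {op!r}")

# Addresses 0-15 hold code; 16-19 hold data: total, step, counter, minus one.
program = ["LOAD", 16, "ADD", 17, "STORE", 16,
           "LOAD", 18, "ADD", 19, "STORE", 18,
           "JNZ", 0, "HALT", 0,
           0, 5, 3, -1]
result = run(program)
```

After the run, address 16 holds 5 added three times. Because code and data share one memory, a `STORE` aimed at addresses 0-15 would rewrite the program itself.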

There are Python compilers which do AST analysis instead of bytecode analysis, particularly Nuitka and Shed Skin. They aren't very good, but it's not clear whether that's because working with the AST is somehow harder than working with the bytecode. RPython doesn't compile all bytecodes; most generator/coroutine functionality is missing, for example.

Think of type-checking as a syntactic analysis; this is how it avoids Rice's theorem. Like you say, we can annotate names with type information, and we can do it without evaluating the code. The main problem here is that Python's semantics don't require these annotations to enforce the types of values; you may be interested in E, a research language from the 90s which did enforce type annotations on otherwise-untyped names. In Python, this doesn't error:

>>> x :int = "42"

But in E, this does error:

? def x :int := "42"
# problem: 

Sadly, E is long dead, and something of an archeological artifact rather than a usable system. But it may be inspiring to your future efforts, especially since it sounds like you're learning how to build compilers. (I helped write Monte, a language which blends E and Python; it is also dead, but was more enjoyable than E.)
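The Python half of that comparison is easy to check: annotations are recorded but never consulted when a value is bound. The `enforce` decorator below is a hypothetical helper of my own that only handles plain-class annotations. It's a rough gesture at guard-like checking, not E's actual semantics and not a replacement for a real type checker.

```python
import functools
import inspect

def double(n: int) -> int:
    return n * 2

# The annotation is stored on the function...
recorded = double.__annotations__["n"]
# ...but nothing stops a str from being passed in.
unchecked = double("42")

def enforce(func):
    """Hypothetical decorator: check arguments against plain-class annotations."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            if expected is inspect.Parameter.empty:
                continue             # unannotated parameters are left alone
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, "
                    f"got {type(value).__name__}")
        return func(*args, **kwargs)

    return wrapper

@enforce
def strict_double(n: int) -> int:
    return n * 2
```

Calling `strict_double("42")` raises `TypeError`, while `double("42")` quietly repeats the string.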

Microsoft is no longer able to outcompete the Free Software commons. That's all.

You might want to re-read the thread and think about how you sound, by the way. You're coming off as a concern troll, not as a member of the Free Software community.

Congratulations on taking a step towards self-hosting and meta-circular compilation. ASDL is a great intermediate meta-language and it can be used to abstract any sort of Builder or Command workflow. This is sometimes called a "narrow waist"; ASDL replaces ad-hoc AST-builder objects with a unified protocol and interface.

For example, I encoded Monte into ASDL while rewriting the compiler in Monte, as part of a self-hosting effort. I also wrote a module which parses ASDL and emits Monte modules, including AST-building tools at runtime. Monte supports E-style quasiliterals, including source-code literals. This let our compiler directly consume ASDL files and emit Monte source code. Going beyond the compiler, this allowed me to encode UI elements and widgets as ASTs and use the Command pattern to write widget-handling objects.
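For readers who haven't met ASDL, a familiar instance is CPython itself: the node classes in the standard `ast` module are generated from an ASDL description of Python's grammar. The sketch below uses those generated constructors as a builder interface. It's an illustration in Python, not the Monte tooling described above.

```python
import ast

# Build the expression 40 + 2 directly from AST constructors, no parsing.
tree = ast.Expression(
    body=ast.BinOp(
        left=ast.Constant(value=40),
        op=ast.Add(),
        right=ast.Constant(value=2),
    )
)
# Hand-built nodes lack line/column info; fill in defaults before compiling.
ast.fix_missing_locations(tree)

value = eval(compile(tree, "<asdl-demo>", "eval"))

# The parser targets the same constructors, so both routes produce the same tree.
same_shape = ast.dump(tree) == ast.dump(ast.parse("40 + 2", mode="eval"))
```

Anything that can call the constructors (a parser, a macro system, a UI) can feed the same compiler, which is the "narrow waist" in practice.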
