It's not wrong though

alphacyberranger@lemmy.world to Programmer Humor@programming.dev – 805 points –
131

Open source ≠ Source available

Examples of non-open-source programs with available source code: https://en.m.wikipedia.org/wiki/List_of_proprietary_source-available_software

Open source ≠ free software

Open source inherently means you can compile the code locally, for free. You can’t necessarily redistribute it, depending on the license, but I’m not aware of a “you can compile this source for testing and code changes only but if you use it as your actual copy you are infringing” license.

I am very much open to correction here.

Open source inherently means you can compile the code locally,

Open Source means more than that. It is defined here:

https://opensource.org/osd/

If you use the phrase "open source" for things that don't meet those criteria, then without some clarifying context, you are misleading people.

for free.

Free Software is not the same as "software for free". It, too, has a specific meaning, defined here:

https://www.gnu.org/philosophy/free-sw.html

When the person to whom you replied wrote "free software", they were not using it in some casual sense to mean free-of-charge.

Free as in free speech, not as in free beer

Most free software is also open source and vice versa, but not all. The difference usually lies in the licence; this Stack Exchange answer explains it pretty well

According to the Open Source Initiative (the folks who control whether things can be officially certified as "open source"), it basically is the same thing as Free Software. In fact, their definition was copied and pasted from the Debian Free Software guidelines.

You are talking about free software. There are non-free licenses that still provide source code.

There are apps whose source is public but which don't follow any of the development practices of open sauce

In this thread: Programmers disassembling the joke to try and figure out why it's funny.

Cute. It would be funnier if it was correct.

For people interested in the difference between decompiled machine code and source code I would recommend looking at the Mario 64 Decomp project. They are attempting to turn a Mario 64 ROM into source code and then back into that same ROM. It's really hard and they've been working on it for a long time. It's come a long way but still isn't done.

https://github.com/n64decomp/sm64

I thought they were done already?

There is still some stuff that needs documenting, but the original goal of recompiling the created source code into the ROMs has been achieved. People are still actively working on it, so in that sense it's maybe never done.

No, it is wrong. Machine code is not source code.

And even if you had the source code it may not necessarily qualify as open source.

Well, assembly is technically "source code" and can be translated 1:1 to and from binary, excluding "syntactic sugar" stuff like macros and labels added on top.

But those things you're excluding are the most important parts of the source code...

By excluded he means macro assemblers, which in my mind do qualify as an actual language as they have more complicated syntax than instruction arg1, arg2 ...

The code is produced by the compiler but it is not the original source. To qualify as source code it needs to be in the original language it was written in and a one-for-one copy. Calling compiler-produced assembly source code is wrong, as it isn't what the author wrote and there could be many versions of it depending on architecture.

Never heard of a decompiler I see.

A decompiler doesn't give you access to the comments or variable names, which are an important part of any source code
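As a loose analogy only (this is a parse/unparse round trip, not a real decompiler), here is a small Python sketch showing that comments are thrown away as soon as source is turned into a syntax tree. The sample source string is made up for illustration, and `ast.unparse` needs Python 3.9 or newer. Python happens to keep variable names at this stage; a native compiler without debug symbols typically drops those too.

```python
import ast

# Made-up source with comments that explain intent.
source = (
    "total = 0  # running sum of scores\n"
    "for s in [1, 2, 3]:\n"
    "    total += s  # accumulate\n"
)

# Parse to a syntax tree, then turn the tree back into text.
round_tripped = ast.unparse(ast.parse(source))

print(round_tripped)
```

The regenerated text still runs, but the explanations of why the code does what it does are gone for good.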

What's cool is that you can interpret the var names yourself and rename them whatever you want.

But it is extremely time-consuming. Open source code makes it transparent and easy to read, that's what it is about: transparency

A decompiler won't give you the source code. Just some code that might not even necessarily work when compiled back.

From the point of view of the decompiler machine code is indeed the source code though

And? Decompilers aren't for noobs. So what if it gives you variable and function names like A000, A001, etc?

It can still lead a seasoned programmer where to go in the raw machine code to mod some things.

You're actually chatting with a hacker that made No-CD hacks.

Try converting from English to Japanese and back to English.

xor ax, ax

A fancy way to say do nothing is not the same as translating back and forth. Example: Show me the intermediate translation.

Also, we live in a 64-bit world now, old man

Also, that instruction does not do nothing: it resets the CPU register to zero without having to access RAM. Far from a NOP instruction.
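A minimal sketch of why XORing a register with itself always gives zero, modelling the 16-bit AX register as a Python integer. The helper name `xor_self` is made up for illustration.

```python
def xor_self(ax: int) -> int:
    """Model `xor ax, ax` on a 16-bit register value."""
    ax &= 0xFFFF      # keep the value within 16 bits, like the AX register
    return ax ^ ax    # every bit XORed with itself is 0

# Whatever AX held before, it ends up as zero.
results = [xor_self(value) for value in (0x0000, 0x1234, 0xFFFF)]
print(results)
```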

Still not the actual source code, bucko.

No, it's actually better when you can read the machine code.

Most folks don't care to recompile the whole thing when all they wanna do is bypass the activation and tracker shit.

Having access to the source code actually makes reading machine code easier, so you're also wrong on this entirely different thing you're going on about.

You've clearly never used a disassembler such as HIEW, have you? You get the entire breakdown of the assembly code.

I disassemble binaries daily for work. It's still not the same as source code.

I didn't say it was. I just said loosely what the OG meme said, if you know how to read assembly, you know how to read (and write) what some of the code does.

I never said disassembly or decompiling was easier in any way. I'll agree with you on that, it's way more difficult.

Back to the point of the meme though, if you can read assembly, you can read it all.

You've never actually compared source code to its compiled output, have you?

I've written drivers in 65 bytes of code. I don't tend to use high level languages that hide what's going on behind the scenes.

Okay, boomer here, be gentle.

So back in the ‘70s I dabbled in programming (now called “coding”, I hear). I only did higher-level languages like Fortran, Cobol, IBM Basic, but a friend had a job (at age 13!) programming in assembler. Is assembler now called assembly, or are they different?

It's still called programming; coding is the same thing. Assembler more commonly refers to the utility program that converts the assembly code to machine code, while assembly refers to the code itself, but the term assembler code is also valid. It's uncommon to simply call the code assembler because it would be easily confused with the utility program.

Yep, some call it assembly, others call it assembler

I thought that the assembler is a specific program that translates mnemonics into the corresponding machine code. Perhaps in early computing this was done by hand so a person was the assembler (and worked in assembler), but now that is handled by software (and supports various macros). So programming in assembly would generate a stream of text that must be assembled by an assembler. (Although I have heard people refer to programming in assembler as well, just not often.)

I hear people say "program in assembler" but IMO that's wrong. I'd say you write the code in "assembly language" (or better yet, the actual architecture you're using like "x86 assembly") but you "assemble" it with an "assembler". Kind of like how you could write a program in the "C language" and "compile" it with a "compiler"

A compiler and an assembler do wildly different things though. An assembler simply replaces mnemonics while a compiler translates instructions into a whole other language.
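A rough sketch of the "replaces mnemonics" idea, using a handful of 6502 opcodes as the example. The table layout, the addressing-mode tags and the `assemble` function are invented for illustration; a real assembler also parses text, handles labels and supports many more addressing modes.

```python
# A few 6502 opcodes keyed by (mnemonic, addressing mode).
OPCODES = {
    ("LDA", "imm"): 0xA9,  # load accumulator with an immediate value
    ("STA", "abs"): 0x8D,  # store accumulator at a 16-bit address
    ("JSR", "abs"): 0x20,  # jump to subroutine
    ("RTS", None): 0x60,   # return from subroutine
    ("NOP", None): 0xEA,   # no operation
}

def assemble(instructions):
    """Translate (mnemonic, mode, operand) tuples into machine code bytes."""
    out = bytearray()
    for mnemonic, mode, operand in instructions:
        out.append(OPCODES[(mnemonic, mode)])
        if mode == "imm":
            out.append(operand & 0xFF)            # one operand byte
        elif mode == "abs":
            out += operand.to_bytes(2, "little")  # low byte first
    return bytes(out)

# LDA #$01 / STA $0200 / RTS
program = [("LDA", "imm", 0x01), ("STA", "abs", 0x0200), ("RTS", None, None)]
machine_code = assemble(program)
print(machine_code.hex(" "))
```

Each instruction maps to a fixed opcode plus its operand bytes, which is why the translation can be reversed by a disassembler.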

Depends on the language, really... C maps pretty closely to assembly language, it's not as simple as one mnemonic to one machine code byte, more like tokens get mapped to sequences of machine code, a function call translates to some code that sets up a stack frame, a return tears it down...

I was too young/poor to afford an assembler for my 6502 so I wrote out the assembly longhand on a legal pad and then manually converted each operation to machine code.

Needless to say my programs done this way were exceptionally simple, but it’s interesting to understand the underlying code.

It just occurred to me that AI in the nearish future will probably/almost certainly be able to do this.

I can't wait for AI to make a PC port of every console game ever so that we can finally stop using emulators.

This won't happen in our lifetime. Not only because this is more complex than rambling vaguely correlated human speech while hallucinating half the time.

I think it'll be in our lifetime just not anytime soon. I feel like AI is gonna boom like the internet did. Didn't happen overnight and not even in a year but over 35ish years

Off the shelf models do this, yes.

Sophisticated locally trained models on expensive private hardware are already dunking on publicly available versions. The problem of hallucination is generally resolved in those contexts

Sure, but until I see such a thing I choose not to believe in fairy tales.

Decompiling arbitrary architecture machine code is quite a few levels above everything I've seen so far which is generally pretty basic pattern recognition paired with statistics and training reinforcement.

I'd argue decompiling arbitrary machine code into either another machine code or legible higher level code is in a whole other league than what AI has proven to be capable of.

Especially because, for this, being 90% accurate is useless.

Again you aren't seeing this because these models are being developed for private enterprise purposes.

Regarding deep machine code analysis, sure, that's gonna take work but the whole hallucination thing is an off the shelf, rookie problem these days

It's not, though. Hallucinations are inherent to the technology, it's not a matter of training. Good training can greatly reduce the likelihood, but cannot solve it.

Training doesn't solve hallucination. I didn't say that

Why does a pre-trained model need expensive private hardware after it was trained, other than to handle API requests faster? Is OpenAI training ChatGPT on inferior hardware compared to these sophisticated private versions you mentioned?

The fine tuning, while much more efficient than starting fresh, can still be a large amount of work.

Then consider that your target corpus of data may also be large.

Then consider that doing your reasoning tasks across that corpus also takes strong hardware to get production-ready response times.

No, OpenAI isn't using inferior hardware, but their model goals, token chunking strategies and overall corpus are generalist in nature.

There are then processing strategies teams are using to go beyond the "memory" limitations GPT-4 has, which provide massive benefits to coherency: essentially anti-hallucination and better overall reasoning

Idk the specifics, but what you say makes it sound like it would be easier to create an AI that recreates a game based on gameplay visuals (and the relevant controls)

That game would still not work because there is a ton of hidden state in all but the simplest computer games that you cannot tell from just playing through the game normally.

An AI could probably reinvent Flappy Bird because there is no more depth than what is currently on screen, but that's about it.

AI prompt: make me a program that will convert PS5 games to PC

AI: Use Convert-PS5GameToPC

End of line

AI can literally read minds. I don't think it's that great of a step to say it should be able to decompile a few games.

About half the time, the text closely – and sometimes precisely – matched the intended meanings of the original words.

Don't be surprised but about half of the time I can predict the result of a coin flip.

I'm not saying it's not interesting but needing custom training and an fMRI is not "an AI can read minds"

It can see if patterns it saw previously reappear in a heavily time-delayed fMRI. Looking for patterns you already know isn't such an impressive feat. Computers have done this for ages now.

It literally can't read minds.

Later, the same participants were scanned listening to a new story or imagining telling a story and the decoder was used to generate text from brain activity alone. About half the time, the text closely – and sometimes precisely – matched the intended meanings of the original words.

You left out the most important context about "half of the time". Guessing what you're thinking of by just looking at your brain activity with 50% accuracy is a very, very good achievement. It's not pulling it out of a 1 or 0 outcome like you are with your coin flip.

You can pretend that the AI is useless and you're the smartest boy in the class all you want, doesn't negate the accomplishments.

Being close (and "sometimes" precise) to the intended meaning is an equally useless metric to measure performance.

Depending on what you allow for "well close enough I think" asking ChatGPT to tell a story without any reading of fMRI would get you to these results. Especially if you know beforehand it's gonna be a story told.

It was a staple of Asimov's books that while trying to predict decisions of the robot brain, nobody in that world ever understood how they fundamentally worked.

He said that while the first few generations were programmed by humans, everything since that was programmed by the previous generation of programs.

This leads us to Asimov's world in which nobody is even remotely capable of creating programs that violate the assumptions built into the first iteration of these systems - are we at that point now?

No. Programs cannot reprogram themselves in a useful way and are very very far from it.

Eh, I'd say continuous training models are pretty close to this. Adapting to changing conditions and new input is kinda what they're for.

Very far from reprogramming though. The general shape of the NN doesn't change, you won't get a NN made to process images to suddenly process code just by training it.

Then how does polymorphic/self-modifying code work?

It doesn't or do you have serious applications for self-modifying code?

Some use it for causing millions of dollars in damage.

It's honestly remarkable how few people in the comments here seem to get the joke.

Never stop dissecting things, y'all.

IDA Pro (a disassembler) is closed source but came with a license that allowed disassembly and binary modification. Unfortunately, that's no longer the case.

Why not use that NSA tool they released?

Ghidra is open source even before you run the disassembler 🤯 great anecdote

Joke aside, that's kind of like claiming that any web frontend is open source because you can access the built, minified and often obfuscated source of it.

So true! I have been "hacking" some chrome extensions recently, do you know of a tool for reverse engineering JS?

If you wanna skip a few inconvenient instructions in x86 assembly, throw a few No Operation instructions in the right places.

NOP = 0x90
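A minimal sketch of that kind of patch, done on an in-memory copy of the bytes. The five-byte snippet and the helper name `nop_out` are made up for illustration; the key constraint is that the patched code keeps exactly the same length, so nothing after it moves.

```python
NOP = 0x90  # single-byte x86 NOP

def nop_out(code: bytes, start: int, length: int) -> bytes:
    """Return a copy of `code` with `length` bytes from `start` replaced by NOPs."""
    patched = bytearray(code)
    patched[start:start + length] = bytes([NOP]) * length
    return bytes(patched)

# Hypothetical snippet: test eax, eax / jz +5 / nop
original = bytes([0x85, 0xC0, 0x74, 0x05, 0x90])

# Overwrite the two-byte conditional jump so execution simply falls through.
patched = nop_out(original, start=2, length=2)
print(patched.hex(" "))
```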

And so you add a hashing check. But then that can be removed.

So you need one in the OS but that can be removed.

So you need one in hardware.

In other words no matter how clever you are there’s always a way to monkey with something unless you have absolute control from silicon on up.
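A rough sketch of the hashing check mentioned above, using SHA-256 from Python's standard library. The program bytes and the names `sha256_hex` and `passes_check` are hypothetical.

```python
import hashlib

def sha256_hex(code: bytes) -> str:
    """Hex digest of the program bytes."""
    return hashlib.sha256(code).hexdigest()

# Hypothetical program bytes: test eax, eax / jz +5 / nop
shipped = bytes([0x85, 0xC0, 0x74, 0x05, 0x90])
expected = sha256_hex(shipped)  # recorded when the program is built

def passes_check(code: bytes) -> bool:
    """True only if the bytes still match the recorded digest."""
    return sha256_hex(code) == expected

# Someone replaces the conditional jump with two NOPs.
tampered = shipped[:2] + b"\x90\x90" + shipped[4:]

print(passes_check(shipped), passes_check(tampered))
```

The check catches the patch, but the comparison is itself just more code on the same machine, so it can be patched out as well. That is why each layer ends up needing a check in the layer below it.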

Here’s a really interesting video the Xbox team did on the challenges of trying to make sure that the content running wasn’t pirated.

https://youtu.be/U7VwtOrwceo

While DRM is the bane of everybody there are cases where trust and integrity is important and it’s an intriguing look into how hard it is to manage.

While DRM is the bane of everybody there are cases where trust and integrity is important and it’s an intriguing look into how hard it is to manage.

Nah, when the user wants to ensure trust and integrity in his own system, it works just fine. The problem comes when the user who needs to be able to access the data is simultaneously the adversary who needs to be stopped from accessing the data.

In other words, it's one of those situations where the fact that it's hard to manage is a gigantic clue that it's wrongheaded to try to do so in the first place.

I agree. I mean when doing secure channel communications or weapons systems or health biometrics.

There are cases where you need to be sure of the integrity of the data and environment

Meanwhile, I've been archiving terabytes of software with no DRM, with no account.

I've wondered: Can you go deeper than assembly and code in straight binary, or does it even really matter because you'd be writing the assembly in binary anyway or what? In probably a less stupid way of putting it: Can you go deeper than assembly in terms of talking to the hardware and possibly just flip the transistors manually?

Even simpler: How do you one up someone who codes in assembly? Can you?

The first computer I used was a PDP-8 clone, which was a very primitive machine by today's standards - it only had 4k words of RAM (hand-made magnetic core memory !) - you could actually do simple programming tasks (such as short sequences of code to load software from paper tape) by entering machine code directly into memory by flipping mechanical switches on the front panel of the machine for individual bits (for data and memory addresses)

You could also write assembly code on paper, and then convert it into machine code by hand, and manually punch the resulting code sequence onto paper tape to then load into the machine (we had a manual paper punching device for this purpose)

Even with only 4k words of RAM, there were actually multiple assemblers and even compilers and interpreters available for the PDP-8 (FOCAL, FORTRAN, PASCAL, BASIC) - we only had a teletype interface (that printed output on paper), no monitor/terminal, so editing code on the machine itself was challenging, although there was a line editor which you could use, generally to enter programs you wrote on paper beforehand.

Writing assembly code is not actually the same as writing straight machine code. Assemblers provide a very useful layer of abstraction, such as function calls, symbolic addressing and variables: instead of always having to specify memory locations, you can use names to refer to jump points/loops, variables, functions, etc. The assembler then converts those into specific addresses as needed, so a small change to code or data structures doesn't require a huge manual process of recalculating all the memory locations; it's all done automatically by the assembler.

So yeah, writing assembly code is still a lot easier than writing direct machine code - even when assembling by hand, you would generally start with assembly code, and just do the extra work that an assembler would do, but by hand.
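The symbolic addressing described above can be sketched as a two-pass label resolver. This is a toy, not PDP-8 assembly: the mnemonics are invented, and it assumes every instruction takes exactly one word.

```python
def resolve_labels(lines, origin=0):
    """Two-pass label resolution for a toy one-word-per-instruction machine."""
    symbols, body = {}, []
    # Pass 1: record the address each label refers to.
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            symbols[line[:-1]] = origin + len(body)
        elif line:
            body.append(line)
    # Pass 2: replace symbolic operands with numeric addresses.
    resolved = []
    for line in body:
        op, *args = line.split()
        args = [str(symbols.get(arg, arg)) for arg in args]
        resolved.append(" ".join([op, *args]))
    return symbols, resolved

source = [
    "start:", "LOAD count", "JUMPZ done", "DEC", "JUMP start",
    "done:", "HALT",
    "count:", "DATA 3",
]
symbols, resolved = resolve_labels(source)

# Insert one instruction near the top: every later address shifts by itself.
shifted_symbols, shifted = resolve_labels(["start:", "NOP", *source[1:]])
print(resolved)
print(shifted)
```

Doing this by hand means redoing the second pass on paper every time an instruction is added or removed, which is the recalculation work the assembler takes over.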

Yes, you can code in machine code. I did it as part of my CS Degree. In our textbook was the manual for the particular ARM processor we coded for, that had every processor-specific command. We did that for a few of the early projects in the course, then moved onto Assembly, then C.

Assembly effectively is coding in binary. Been a long time since I've looked at it, but you'd basically just be recreating the basic assembly commands anyway.

I guess you could try flipping individual transistors with a magnet or an electron gun or something if you really want to make things difficult.

If you actually want to one-up assembly coders, then you can try designing your own processor on breadboard and writing your own machine code. Not a lot of easy ways to get into that, but there's a couple of turbo dorks on YouTube. Or you could just try reading the RISC-V specification.

But even then, you're following in someone else's tracks. I've never seen someone try silicon micro-lithography in the home lab, so there's an idea. Or you could always try to beat the big corps to the punch on quantum computing.

You can code in binary, but the only thing you'd be doing is frustrating yourself. We did it in the first week of computer science at the university. Assembly is basically just a human readable form of those instructions. Instead of some opcode in binary you can at least write "add", which makes it easier to see what's going on. The binary machine code is not some totally other language than what is written in the assembly code, so writing in binary doesn't really provide any more control or benefit as far as I'm aware.

All those assembly language instructions are just mnemonics for the actual opcodes. IIRC, on the 6502 processor family, JSR (Jump to SubRoutine) was hex 20, decimal 32. So going deeper would be really limited to not having access to the various amenities provided by assembler software and writing the memory directly. For example:

I started programming using a VIC-20. It came with BASIC, but you could have larger programs if you used assembly. I couldn't afford the assembler cartridge, so I POKED the decimal values of everything directly to memory. I ended up memorizing some of the more common opcodes. (I don't know why I was working in decimal instead of hex. Maybe the text representation was, on average, smaller because there was no need of a hex symbol. Whatever, it doesn't matter...)

VIC-BASIC had direct memory access via PEEK (retrieve value) and POKE (set value). It also had READ and DATA statements. READ retrieved values from the comma-delimited list of values following the DATA statement (usually just a big blob of values as the last line of your program).

I would write my program as a long comma-delimited list of decimal values in a DATA statement, READ and POKE those values in a loop, then execute the resulting program. For small programs, I just saved everything as that BASIC program. For larger programs, I wrote those decimal values to tape, then read them into memory. That let me do a kind of modular programming by loading common functions from tape instead of retyping them.

I was in the process of writing my own assembler so that I could use the mnemonics directly when I got my Apple //c. More memory and the availability of quite a few high level languages derailed me and I haven't touched assembly since.
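The DATA/READ/POKE loader described above can be sketched in Python, with a `bytearray` standing in for the machine's memory. The load address and the three-instruction program are made up for illustration; the decimal values are the 6502 opcodes for LDA immediate (169), STA absolute (141) and RTS (96).

```python
MEMORY = bytearray(65536)  # 64 KiB address space, as on a 6502

def poke(address: int, value: int) -> None:
    """Set one byte of memory (a bytearray rejects values outside 0-255)."""
    MEMORY[address] = value

def peek(address: int) -> int:
    """Retrieve one byte of memory."""
    return MEMORY[address]

# Decimal values as they would appear in a DATA statement:
# LDA #1 / STA $0200 / RTS
DATA = [169, 1, 141, 0, 2, 96]
START = 4096  # hypothetical load address

# The READ-and-POKE loop: copy each value into consecutive addresses.
for offset, value in enumerate(DATA):
    poke(START + offset, value)

print([peek(START + i) for i in range(len(DATA))])
```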

Re: Coding in binary. It makes no difference. Your assembly is binary, just represented in a more human readable form when writing it in assembly.

Re: Manual interaction. Sure there's plenty of old computers where you can flip switches to input instructions or manipulate registers (memory on the cpu). But this is not much different from using assembly instructions except you're doing it live.

You can also create purpose built processors which might be what you mean? Generally this isn't too useful but sometimes it is. FPGAs are an example of doing this type of thing but using software to do the programming of the processor.

this isn't too useful

The point isn't to be good, practical or useful. It's to be cool 😎

But this silly question still informed me of something I had misunderstood: I had thought assembly and machine code were the same thing.

You could make a simple accumulator machine out of logic gates and program it by entering binary instructions, expressed in hexadecimal, into its register, but it wouldn't be capable of all the operations of a full computer. And yes, the first programming was just opcodes, entered by flipping switches or on punch cards; there was no assembly language. Assembly language is pretty much just mnemonics for operations and registers. In school I had to write a couple of C programs, use the GNU C compiler to disassemble them into x86 assembly, and see what they were doing at that level. Then we "wrote" some x86 assembly by copy-pasting a lot of instructions. It's not that hard to make something that works in x86 assembly or Jasmin (Java virtual machine assembly language) if it's simple enough.
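A minimal simulator for the kind of accumulator machine described above. The instruction set is entirely made up for this sketch: each instruction is one byte, with the opcode in the high nibble and a memory address in the low nibble.

```python
def run(program, memory):
    """Run a made-up accumulator machine until it reaches HALT."""
    acc, pc = 0, 0
    while True:
        instruction = program[pc]
        opcode, address = instruction >> 4, instruction & 0x0F
        pc += 1
        if opcode == 0x1:      # LOAD: acc = memory[address]
            acc = memory[address]
        elif opcode == 0x2:    # ADD: acc += memory[address], wrapping at 8 bits
            acc = (acc + memory[address]) & 0xFF
        elif opcode == 0x3:    # STORE: memory[address] = acc
            memory[address] = acc
        elif opcode == 0x0:    # HALT: stop and return the accumulator
            return acc
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")

memory = [0] * 16
memory[0], memory[1] = 2, 3

# LOAD 0 / ADD 1 / STORE 2 / HALT, entered as raw hex "opcodes"
program = [0x10, 0x21, 0x32, 0x00]
result = run(program, memory)
print(result, memory[2])
```

Giving those hex values names like LOAD and ADD is all an assembly language would add on top of this.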

You can have the code of any software with a decompiler. Especially with Java and C# for example.

Open source code refers to the comments and the documentation.

so, like half (more?) of current 'open source' isn't, then? because it lacks in one or the other.. or both?

Yeah, but which version of assembly?

Microsoft's Assembly# of course. It's new. It's just different enough to extinguish assembly

Depends on the CPU. Either way there are cross-compilers and cross-disassemblers.

And even failing those options, there's always hex editors for those really in the know.