ChatGPT, how do I use OCR in Word?

EnterOne@lemdro.id to

Technology@lemmy.world – 615 points – 1 year ago

ChatGPT is a revolution in surrealism.

No joke, there's a whole world of memes and interpretations we can get from them

What is it showing? What did it learn from in order to do that?

Like r/DisneyVacation, but with whatever the AI was smoking in slide 4

This is the only valid take tbh

It then gave me step-by-step text instructions on how to use the OCR feature in Microsoft Word to import text from a picture, and admitted in step 3 that the function doesn't exist. There were 6 steps.

and admitted in step 3 that the function doesn't exist.

🤣

1 2 5

4 3 6

3 3 8

I didn't even see the numbers at first

All I could think about when reading the numbers was the It Crowd emergency phone number song: https://youtu.be/GTRil00Lfhc

... 3

I noticed that on my fourth read through. I'm still finding new obscure things in it.

Reading this is what I imagine having a stroke feels like.

Here's the Linux version of this:

I was following it correctly up until the part where you have to place a child on your laptop. I wish these things would let you know the parts required beforehand.

Don't forget the requisite top-hat!

Also the hover-tongs.

Just use your neighbour's child

Just take one off the street, they're free

I regularly feel like I've turned into a magician when 30 layers deep into my process of "fixing" something. So at least that is accurate.

But are you using child labor yet?

I kinda think it's ChatGPT's interpretation of Tux?

Whoa. Yes!

"The design is very human"

This is some /r/surrealmemes shit right here.

Part of me wants AI to never evolve so it can keep making images like these forever.

probably the best actual outcome tbh

Old technology never dies.

Except when it's closed source and on a company server somewhere

The specific programs may be lost but the idea behind them won't be.

Flash isn't even that old and it's already dead

What is fun to me is that it completely made up a bunch of computer and office accessories that don't exist.

Just wait, soon we'll all be editing documents using tiny scalpels.

Of course, the only person who would try to do this is doing an essay about marijuana.

That was common in graphical design back in the (pre 90s) day.

I worked in a print shop in the 90s (until 95) and we still used X-Acto knives for our layouts. We had a computer, but no one really knew how to use it for graphic design yet.

The text in the image represents how accurate it tends to be whenever I try to OCR a document.

for windows use, try powertoys's powerocr

It uses the Windows built-in OCR API. It isn't very good in my experience.

If you're not getting good results, have you tried Ocr asegontorrittln the image first?

fwiw I've used it pretty extensively on screenshots of text I keep getting sent at work, so far I haven't noticed any mistakes at all. may just be the type of images though

Despite the constant negative press covfefe

This is why I'm convinced no LLM could ever accurately produce the insane and moronic shit he comes up with.

Maybe with a small language model.

Yuge language model

Laugh while you can, fellow meat bags.

I'm actually impressed by the reasonably coherent (though nonsense) text. If you think about how generative AI works it's very surprising it could form words in images.

Microsoft's image generator has been getting better and better at text. There are still plenty of problems, especially with small text, but someone on another forum was able to get it to output this with a very small prompt:

It's surprisingly good at making nonsense

what do you mean nonsense

Beautiful, it's a work of art unparalleled in the modern era

I want this on a t-shirt

Last step, "diable" is the devil in French. Therefore the last text more or less means "OCR the text into the devil's text"

This looks like scam email and Aliexpress products merged together.

Step 4: get baked

That little green... thing on the left looks high AF.

Ah shit. It lost me at painting the QR code by hand.

That’s how I always do it myself.

I'm stuck on step 3

Which step 3?

Which one of them?

I prefer to convord ttp manually rather than use the trext tims.

But then how do you ensure that the text will be diåble?

I have never seen ChatGPT produce images. Is this a feature of 4.0?

Yeah is this linked with dall-e?

It is. The paid version (GPT-4) is integrated with DALL-E 3.

This has all the hallmarks of "human pretending to be an AI" rather than actual AI output

I disagree. This is, as you say, precisely the type of thing that happens when an image generator is asked to make a chart/diagram, so to me it seems a really wild leap to go from "This looks like exactly what happens when X" to "someone must have designed this to look like what happens when X".

If it were human designed, I think it would be intentionally funny (which realistically would backfire, but anyway...)

(And besides, paid ChatGPT does indeed connect to DALL-E 3 now)

Tbf I thought DALL-E 3 was still only available via Bing Image Creator; I missed the memo that ChatGPT was hooked up to it too.

Still, to me it looks like it's human-generated to try and be funny (it's just that haha-AI-so-silly isn't groundbreakingly funny any more). It's mostly the information continuity throughout the image that I've not really seen from an image-generating AI before (especially when not even prompted for it), and I've had a play around with DALL-E 3, so I would expect the ChatGPT version to be equivalent.

Maybe I'm too cynical, but this just reeks of fake to me.

I tried the same prompts as OP, it didn't generate an image at first instance - had to ask it to generate one. This is the image I got:

@EnterOne@lemdro.id

Ropy from pituge

ChatGPT takes the liberty of creating a DALL-E prompt that it doesn't feel the need to share with the user. You can, however, ask ChatGPT to share the exact prompt and seed with you to reproduce the image. Here is the actual prompt and seed DALL-E ended up working with:

Prompt: "A step-by-step visual guide on using Optical Character Recognition (OCR) in Microsoft Word. The guide includes steps like opening Microsoft Word, inserting an image into a Word document, selecting the image, and using the OCR feature to convert the text in the image into editable text. The layout should be clear and easy to follow, with each step labeled and illustrated in a user-friendly manner, catering to users with basic proficiency in Microsoft Word."

Seed: 3993182816

To be clear, ChatGPT decided on its own to create and send this prompt to DALL-E in response to my request for tech support.

Why do you think that?

There's a level of continuity in the image you don't get with image generating AI yet.

Also it's littered with "AI getting things slightly wrong" memes

Also also, ChatGPT doesn't output images

It does: https://openai.com/blog/chatgpt-can-now-see-hear-and-speak

Edit: here's one I did now

Ah fair play, I missed that memo, the first two points still apply though

Yep, sure, it's a wild world we live in and this topic is changing fast. Missing this memo won't matter when the next one will be the next generation but generations are only 6 months apart.

That's how you know the AI is good! actually.

I had a lot of fun asking it to draw ASCII art for me... especially if you ask it for corrections about specific aspects of its art

Okay I've convorded the ttp by using the trext tins, but I'm not sure what comes next or why I'm holding a paint brush.

ChatGPT is like the special kid in school that can't have scissors or glue unattended.

And people are terrified at the idea of AGI. Lol. Lmao even.

It's not AGI that's terrifying, but how people are so willing to let anything take over their control. LLMs are "just" predictive text generation with a lot of extras to make things come out really convincing sometimes, and yet so many individuals and companies basically handed over the keys without even second guessing its answers.

These past few years have shown that if (and it's a big if) AGI/ASI comes along, we are so screwed, because we can't even handle dumber tools well. LLMs in the hands of willing idiots can be a disaster in themselves, and it's possible we're already there.

Cuil.

opjical carttcer recegnition

why is dream stuff

Step OCCR what are you doing

This is glorious