Artificial Intelligence

DrChaos · Jun 10, 2024

heltok said:
Ok, sell your stocks.

And I really don't think he is bored. Optimus, Dojo vs nVidia, Cybercab 8/8, FSD China, 25k Model and AGI race with OpenAI for the stake of humanity. Doubt he finds that boring...

His personality and management is dissuading the high end ML scientists who have options. Who would choose Tesla over OpenAI, Anthropic, FAIR, Microsoft or Deep Mind? Particularly with Musk's ill informed hot takes and aggressive firing and chaotic demands.

With musk running things more personally they're not going to beat nVidia, or openAI or BYD at this rate. He killed off the 25K model which was the key to making a profitable robotaxi as it should have been on mostly the same platform.

Beating GM is much easier, they should try keeping going with that.

heltok · Jun 10, 2024

DrChaos said:
Who would choose Tesla over OpenAI, Anthropic, FAIR, Microsoft or Deep Mind?

I think many of the hard working AI experts listen to these guys and resonate with them:

x.com

https://twitter.com/x/status/1799734177702764712

Imo Tesla are not lacking for applications to their job listings.

eevee-fan · Jun 11, 2024

DrChaos said:
His personality and management is dissuading the high end ML scientists who have options. Who would choose Tesla over OpenAI, Anthropic, FAIR, Microsoft or Deep Mind? Particularly with Musk's ill informed hot takes and aggressive firing and chaotic demands.

With musk running things more personally they're not going to beat nVidia, or openAI or BYD at this rate. He killed off the 25K model which was the key to making a profitable robotaxi as it should have been on mostly the same platform.

Beating GM is much easier, they should try keeping going with that.

Elon's whole argument of 25% control is just silly. Pretty sure there are more than enough Bluepill shareholders who would gladly sign the stock voting rights to him where he would then have at least 40% voting rights.

Cosmacelf · Jun 11, 2024

eevee-fan said:
Elon's whole argument of 25% control is just silly. Pretty sure there are more than enough Bluepill shareholders who would gladly sign the stock voting rights to him where he would then have at least 40% voting rights.

Actually he was counting on them so that his 25% control would effectively give him majority control.

eevee-fan · Jun 11, 2024

Cosmacelf said:
Actually he was counting on them so that his 25% control would effectively give him majority control.

So as he gets more unhinged, he will demand more and more shares until he owns 51% of Tesla? LOL

If he cannot get 51% of the shareholders to see things his way, maybe it is time for him to go.

DrChaos · Jun 11, 2024

heltok said:
Imo Tesla are not lacking for applications to their job listings.

It's never a numerical issue. The issue is the type of people applying for those. History shows by far the best don't work at an environment with aggressive nasty leaders. They work at Bell Labs type or Los Alamos Labs with real scientists running the place and insulating the creators from foolish external demands.

FAIR is doing great with Yann LeCun setting the environment and attitude because he understands it.

With someone else as CEO they might be able to get Karpathy back and Ilya Sutskever or people at their level to run robot labs at a deep level.

"Grok" from xAI isn't doing anything deep or significant as Elon care more about being a snarky teenager and "anti-woke", his personal emotional agitations, than discoveries and advancements. Like Twitter, it is a pure example of 100% Musk-circa-2024 management directives on AI and product design. It's not appealing to major minds.

eevee-fan said:
So as he gets more unhinged, he will demand more and more shares until he owns 51% of Tesla? LOL

If he cannot get 51% of the shareholders to see things his way, maybe it is time for him to go.

That's the issue, Musk turning into Howard Hughes while maintaining a complete lock on management and the future. It's really dangerous and would lower US GDP and power substantially to lose Tesla. It's a national asset, like SpaceX.

Tiger · Jun 11, 2024

DrChaos said:
"Grok" from xAI isn't doing anything deep or significant as Elon care more about being a snarky teenager and "anti-woke", his personal emotional agitations, than discoveries and advancements. Like Twitter, it is a pure example of 100% Musk-circa-2024 management directives on AI and product design. It's not appealing to major minds.

Appeals to the next generation who will be in the drivers' seat in the future (fsd pun intended).

DrChaos said:
That's the issue, Musk turning into Howard Hughes while maintaining a complete lock on management and the future. It's really dangerous and would lower US GDP and power substantially to lose Tesla. It's a national asset, like SpaceX.

Let's get back on topic?

heltok · Jun 11, 2024

DrChaos said:
FAIR is doing great with Yann LeCun setting the environment and attitude because he understands it.

https://twitter.com/x/status/1797347030466953272

heltok · Jun 13, 2024

I guess when Elon is talking about using 100M cars with HW4+ as a massive cluster it pretty much is this but scaled up:

https://twitter.com/x/status/1801322369434173860

Cosmacelf · Jun 14, 2024

heltok said:
https://twitter.com/x/status/1797347030466953272

Not really. There’s a four year old benchmark that five year old humans can do that the best LLMs cannot. And now there’s a $1M prize for beating it.

ARC Prize

ARC Prize is a $1,000,000+ nonprofit, public competition to beat and open source a solution to the ARC-AGI benchmark.

arcprize.org

DrChaos · Jun 14, 2024

heltok said:
https://twitter.com/x/status/1797347030466953272

He said that AI will not learn spatial reasoning purely from text and he is correct.

The LLMs can sort of guess at some proxy to spatial reasoning when expressed through text on simple examples---and the GPT-4 system is opaque how it is trained but it is not only on text.

LeCun does not believe at all that "AI" can't ever learn this fully, but that autoregressive language modeling on text input alone can't. Recall that the LLMs can fake lots of performance as they know test cases and things like it---as they can often do well on medical questions covered by published text but not on novel original concepts.

LeCun's research program is to fix this problem and he's foresighted and in my opinion, correct. He's working on non-autoregressive concepts (where prediction is more like an optimization theory based relaxation to optimum operator of a whole solution rather than one tiny bit at at time) and obviously spatial and physical reasoning connected to embodied sensors, like the natural physical intelligence that a fish or bee has, which evolution has found a very low energy use and small neuron solution.

There needs to be new mathematical and conceptual discoveries in AI to get there.

No, don't dismiss LeCun as some sort of doddery fool.

DrChaos · Jun 14, 2024

heltok said:
I guess when Elon is talking about using 100M cars with HW4+ as a massive cluster it pretty much is this but scaled up:

https://twitter.com/x/status/1801322369434173860

High latency low reliability network connections aren't so useful for very large scale training. There's a good reason the highly intimately linked supercomputer NVidia boards binding coprocessors very tightly and other connections binding boards closely are preferred and most useful. This means high performance supercomputer purpose designed, not loose conglomerations of cheap consumer hardware. This is hard core NVidia tech and extremely difficult to implement.

Stochastic Gradient Descent (and variants that everyone uses to train) is intrinsically serial because you want to score and backprop the new examples starting from the model parameters which have been previously updated. You can try to distribute some gradient updates and add them which does work partially but it is still lowering training performance when they're based on where the model used to be some number of steps ago, and they aren't synchronized.

In most measures the model train performance is proportional to the number of gradient updates you can do, meaning serial gradient updates. Wide distribution is increasing the minibatch size (and adding some noise in addition) and possible latency and dead time, but increasing minibatch size doesn't improve performance after some point vs getting more updates.

Given a trillion examples to get through the tradeoff of more gradient updates on smaller batches (up to some point) is preferred vs the limit of of course one highly distributed batch of a trillion examples with gradients all added up.

Even more clear example: weather prediction & fluid mechanics. You can't susbtitute 100x the comptue power predicting weather in more detail starting from "now" vs evolving the prediction forward in physical time and predicting future small steps beyond that, theres a reason supercomputing needs very low latency high reliability distribution right nearby.

In a nutshell, Elon is bullshitting and trying to sell cars and equity price by implying those 100M cars will have any significant use for training. That's a "No". Same with Apple if they're implying the same.

His newly acquired $56 billion on the other hand which he took from himself away from the shareholders could have been a capital raise for the company, and the shareholders would have owned and his scientists could have bought and used some very powerful and actually useful training supercomputers.
Tesla - Elon + $56 billion capital >> Tesla + Elon - 56 billion.

You dont see Zuckerberg demanding more free equity from the board for himself even though by funding Pytorch for free he's done far more for AI than any other businessman in truth. And TBH NVidia owes Meta for that too by making NVidia the most preferred target & best supported.

eevee-fan · Jun 14, 2024

New algorithm discovers language just by watching videos

DenseAV, developed by MIT and Google researchers, learns to parse and understand the meaning of language just by watching videos of people talking, with potential applications in multimedia search, language learning, and robotics.

news.mit.edu

Let's train it by letting it watch The Matrix!

heltok · Jun 14, 2024

DrChaos said:
High latency low reliability network connections aren't so useful for very large scale training.

But for very large scale inference.

DrChaos · Jun 14, 2024

heltok said:
But for very large scale inference.

Potentially but distributed parked computers don't have that much value. The cases it could handle are limited---and useless unless there is at least high reliable download bandwith. The tasks have to be trivially paralellizable.

The value would be large scale scoring/evaluation of models against test data, presumably video. The compute provider on car would have to download a multi-gigabyte model and then stream high-bandwith video continuously and then send results back up to the mothership.

If I were on this program I woudn't bother---the software engineering and development necessary, as well as the social and marketing noise necessary to keep this running smoothly, would be fairly substantial and for minimal gain. On the other hand, competitive inference hardware from a chip maker, e.g. one competing against NVidia and desperate for some business, like AMD or Intel, could be installed in the datacenter colocated with training, and with tons of bandwidth from the train computer. Possibly even Tesla's own custom on-board inference hardware if it makes sense to check how it would perform.

The scientist and developers could immediately score models with similar data right near the datastorage and with 10 GB/s or more I/O bandwith, all local, a few dozen nanoseconds away.

Human labor is expensive---inference matmuls are going to be provided by fiercely competitive chipmakers who can't dethrone NVidia on train. All that effort all to make inference chips cheaper and faster and energy efficient---Tesla should take advantage as a volume buyer. Hacking it up on people's owned cars is like something the USSR would have to do, being flush with labor and poor with capital and chips. Tesla is the opposite---spend money on chips once there is a high competitive market.

Inference on customer owned cars is not going to be something significantly useful or cost-effective given the dev and operations costs. Unless it's purely for marketing and meant to make people feel good about their purchases and will encourage more sales. I think that's a stretch.

heltok · Jun 14, 2024

DrChaos said:
Potentially but distributed parked computers don't have that much value. The cases it could handle are limited---and useless unless there is at least high reliable download bandwith.

Yeah, would be great if Elon had a reliable internet provider with good download bandwidth.

Search

Artificial Intelligence

DrChaos

Active Member

heltok

Active Member

x.com

eevee-fan

Active Member

Cosmacelf

Well-Known Member

eevee-fan

Active Member

DrChaos

Active Member

Tiger

Active Member

heltok

Active Member

heltok

Active Member

Cosmacelf

Well-Known Member

ARC Prize

DrChaos

Active Member

DrChaos

Active Member

eevee-fan

Active Member

New algorithm discovers language just by watching videos

heltok

Active Member

DrChaos

Active Member

heltok

Active Member

Similar threads