IL0-786 certification - Designing Flexible Wireless LAN Solutions Updated: 2023
Exam Code: IL0-786 Designing Flexible Wireless LAN Solutions certification, November 2023, by Killexams.com team
Designing Flexible Wireless LAN Solutions
A customer has a DHCP server on the network and wants the Intel PRO/Wireless Access
Point to be automatically assigned an IP address from this server. What must you do to
allow this capability on a new installation?
A. You use the serial interface and enable DHCP.
B. You first assign a static IP and then reboot the Access Point to enable DHCP.
C. No configuration is required. DHCP is the default configuration.
D. There is nothing you can do. DHCP is not supported on the Access Point.
Which three attenuate RF signals the most? (Choose three.)
D. paper rolls
What is WEP?
A. Wired Equivalent Privacy
B. Windows Equivalent Privacy
C. Wireless Equivalent Privacy
D. Windows Equivalent Protection
Wireless infrastructure networking requires which two parameters? (Choose two.)
A. ESS ID
B. hop sequence
C. encryption level
D. access control list
What is the first level of security encountered when managing an AP?
B. 40-bit encryption
C. 128-bit encryption
D. system password
What are two methods of loading an ACL on an AP? (Choose two.)
A. load an ACL from a file using Xmodem
B. Telnet to the AP, then load an ACL file
C. manually enter individual MAC addresses
D. manually enter a range of MAC addresses
MobileUnit 2 are out of range of each other. MobileUnit 1 and MobileUnit 2 transmit at the
same time and a collision occurs. What is this issue called?
B. skin effect
C. hidden node problem
D. Differentiated Inter-Frequency Stability
What do the periods in front of certain configuration options indicate?
A. The option can be saved to all APs with the same subnet.
B. The option can be saved to all APs with the same ESSID.
C. The option can be saved to all APs on the same subnet and same firmware revision.
D. The option can be saved to all APs with the same ESSID and same firmware revision.
What are two ways to deny access to a network through an AP? (Choose two.)
AI is changing processor design in fundamental ways, combining customized processing elements for specific AI workloads with more traditional processors for other tasks.
But the tradeoffs are increasingly confusing, complex, and challenging to manage. For example, workloads can change faster than the time it takes to churn out customized designs. In addition, the AI-specific processes may exceed the power and thermal budgets, which may require adjustments in the workloads. And integrating all of these pieces may create issues that need to be solved at the system level, not just in the chip.
“AI workloads have turned processor architecture on its head,” said Steven Woo, fellow and distinguished inventor at Rambus. “It was clear that existing architectures didn’t work really well. Once people started realizing back in the 2014 timeframe that you could use GPUs and get tremendous gains in training performance, it gave AI a massive boost. That’s when people started saying, ‘A GPU is kind of a specialized architecture. Can we do more?’ It was clear back then that multiply accumulates, which are very common in AI, were the bottleneck. Now you’ve got all this great hardware. We’ve got the multiply accumulate stuff nailed. So what else do we have to put in the hardware? That’s really what architecture is all about. It’s all about finding the tall peg or the long tent pole in the tent, and knocking it down.”
Others agree. “AI just lends itself to GPU architecture, and that’s why NVIDIA has a trillion dollar market cap,” said Ansys director Rich Goldman. “Interestingly, Intel has been doing GPUs for a long time, but inside of their CPUs to drive the video processors. Now they’re doing standalone GPUs. Also, AMD has a very interesting architecture, where the GPU and CPU share memory. However, the CPU is still important. NVIDIA’s Grace Hopper is the CPU-GPU combination, because not everything lends itself to a GPU architecture. Even in applications that do, there are parts that run just small CPUs. For decades, we’ve been running everything on a CPU x86 architecture, maybe RISC architecture, but it’s a CPU. Different applications run better on different architectures, and it just happened that NVIDIA focused first on video gaming, and that translated into animation and movies. That same architecture lends itself very well to artificial intelligence, which is driving everything today.”
The challenge now is how to develop more efficient platforms that can be optimized for specific use cases. “When you implement this thing in real scalable hardware, not just one-off use cases, the challenge then becomes how do you run this thing?” said Suhas Mitra, product marketing director for Tensilica AI Products at Cadence. “Traditionally in processors, we had a CPU. And if you had a mobile platform, you had a GPU, DSP, etc. All of this got rattled because people saw these workloads are sometimes embarrassingly parallel. And with the advent of parallel computing, which is the reason GPUs became very popular (they had very good hardware engines that could do parallel processing), the suppliers easily cashed in immediately.”
This works best when workloads are well-defined, said Sharad Chole, chief scientist at Expedera. “In those kinds of architectures, let’s say you are trying to integrate an ISP and NPU in a tightly coupled fashion in edge architectures. The SoC leads are looking into how they can reduce the area and power for the design.”
The challenge here is to understand the latency implications of the memory portion of the architecture, Chole said. “If an NPU is slow, what would the memory look like? When the NPU is fast, what would the memory look like? Finally, the questions between balancing the MACs versus balancing the memory comes from there where we are trying to reduce as much as possible for input and output buffering.”
External memory bandwidth is a key part of this, as well, particularly for edge devices. “No one has enough bandwidth,” he added. “So how do we partition the workload or schedule the neural network so that the external memory bandwidth is sustained, and is as low as possible? That’s basically something we do by doing packetization or breaking the neural network into smaller pieces and trying to execute both pieces.”
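The idea of breaking a network into smaller pieces can be pictured with a toy partitioner. This is only a rough sketch, not Expedera's packetization scheme: the function name `partition_rows`, the per-row byte cost, and the buffer size are all hypothetical, and a real scheduler would also account for weights, overlap between pieces, and timing.

```python
def partition_rows(total_rows: int, bytes_per_row: int, buffer_bytes: int) -> list[tuple[int, int]]:
    """Split a layer's rows into chunks that each fit in an on-chip buffer.

    Returns half-open (start, end) row ranges. All sizes are illustrative.
    """
    rows_per_chunk = buffer_bytes // bytes_per_row
    if rows_per_chunk < 1:
        raise ValueError("a single row does not fit in the buffer")
    return [
        (start, min(start + rows_per_chunk, total_rows))
        for start in range(0, total_rows, rows_per_chunk)
    ]


# Hypothetical layer: 10 rows of 300 bytes each, with a 1,000-byte buffer.
chunks = partition_rows(total_rows=10, bytes_per_row=300, buffer_bytes=1000)
print(chunks)  # [(0, 3), (3, 6), (6, 9), (9, 10)]
```

Each chunk can then be fetched, processed, and written back on its own, which keeps the working set small at the cost of more scheduling work.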
Designing for a rapidly changing future
“If you say you’re going to build a CPU that is really great at these LSTM (long short-term memory) models, that cycle is a couple of years,” said Rambus’ Woo. “Then you realize in two years, LSTM models came and went as the dominant thing. You want to do specialized hardware, but you have to do it faster to keep up. The holy grail would be if we could create hardware as fast as we could change algorithms. That would be great, but we can’t do that even though the industry is being pressured to do that.”
This also means the architecture of the processor handling AI workloads will be different than a processor that is not focused on AI workloads. “If you look at these engines for doing training, they’re not going to run Linux or Word, because they’re not designed for general-purpose branching, a wide range of instructions, or to support a wide range of languages,” Woo said. “They are pretty much bare-bones engines that are built to go very fast on a small number of types of operations. They’re highly tuned to the specific data movement patterns required to do the computations. In the Google TPU, for example, the systolic array architecture has been around since the 1980s. It’s very good at doing a particular type of very evenly distributed work over large arrays of data, so it’s perfect for these dense neural networks. But running general-purpose code is not what these things are designed to do. They’re more like massive co-processors that do the really big part of the computation really well, but they still need to interface to something else that can manage the rest of the computation.”
Even the benchmarking is difficult, because it’s not always an apples-to-apples comparison, and that makes it hard to develop the architecture. “This is a hard topic because different people use different tools to navigate this,” said Expedera’s Chole. “What this task looks like in the day-to-day of the design engineer is system-level benchmarking. Every part of the SoC you benchmark individually, and you’re trying to extrapolate based on those numbers what the bandwidth required is. ‘This is the performance, this is the latency I’m going to get.’ Based on that, you’re trying to estimate how the entire system would look. But as we actually make more headway during the design process, then we are looking into some sort of simulation-based approach where it’s not a full-blown simulation, like a transaction-accurate simulation within that simulation to get to the exact performance and exact bandwidth requirement for different design blocks. For example, there is a RISC-V and there is an NPU, and they have to work together and fully co-exist. Do they have to be pipelined? Can their workload be pipelined? How many exact cycles does the RISC require? For that, we have to compile the program on the RISC-V, compile our program on the NPU, then co-simulate that.”
Impact of AI workloads on processor design
According to Ian Bratt, fellow and senior director of technology at Arm, “PPA tradeoffs for ML workloads are similar to the tradeoffs all architects face when looking at acceleration – energy efficiency versus area. Over the last several years, CPUs have gotten significantly better at ML workloads with the addition of ML-specific acceleration instructions. Many ML workloads will run admirably on a modern CPU. However, if you are in a highly constrained energy environment then it may be worth paying the additional silicon area cost to add dedicated NPUs, which are more energy efficient than a CPU for ML inference. This efficiency comes at the cost of additional silicon area and sacrificing flexibility; NPU IP can often only run neural networks. Additionally, a dedicated unit like an NPU may also be capable of achieving a higher overall performance (lower latency) than a more flexible component like a CPU.”
Russell Klein, program director for Siemens EDA’s Catapult Software Division explained, “There are two major aspects of the design that will most significantly impact its operating characteristics, or PPA. One is the data representation used in the calculations. Floating point numbers are really quite inefficient for most machine learning calculations. Using a more appropriate representation can make the design faster, smaller, and lower power.”
The other major factor is the number of compute elements in the design. “Essentially, how many multipliers will be built into the design,” Klein said. “This brings parallelism, which is needed to deliver performance. A design can have a large number of multipliers, making it big, power hungry, and fast. Or it can have just a few, making it small and low power, but a lot slower. One additional metric, beyond power, performance, and area, that is very important is energy per inference. Anything that is battery powered, or that harvests energy, will likely be more sensitive to energy per inference than power.”
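The distinction Klein draws between power and energy per inference comes down to simple arithmetic: energy is average power multiplied by the time an inference takes. The sketch below uses made-up power and latency figures for two hypothetical designs, purely to show that the lower-power design is not automatically the lower-energy one.

```python
def energy_per_inference_mj(power_mw: float, latency_ms: float) -> float:
    """Energy per inference in millijoules.

    mW * ms gives microjoules, so divide by 1000 to get millijoules.
    """
    return power_mw * latency_ms / 1000.0


# Hypothetical designs (numbers are illustrative, not from the article).
many_multipliers = energy_per_inference_mj(power_mw=2000.0, latency_ms=5.0)
few_multipliers = energy_per_inference_mj(power_mw=200.0, latency_ms=80.0)

print(many_multipliers)  # 10.0 mJ: high power, but finishes quickly
print(few_multipliers)   # 16.0 mJ: low power, but runs much longer
```

With these assumed figures, the design that draws ten times less power still spends more energy on each inference, which is what matters for a battery-powered device.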
The numeric representation of features and weights can also have a significant impact on the PPA of the design.
“In the data center, everything is a 32-bit floating point number. Alternative representations can reduce the size of the operators and the amount of data that needs to be moved and stored,” he noted. “Most AI algorithms do not need the full range that floating point numbers support and work fine with fixed point numbers. Fixed point multipliers are usually about ½ the area and power of a corresponding floating point multiplier, and they run faster. Often, 32 bits of fixed point representation is not needed, either. Many algorithms can reduce the bit width of features and weights to 16 bits, or in some cases 8 bits or even smaller. The size and power of a multiplier are proportional to the square of the size of the data that it operates on. So, a 16-bit multiplier is ¼ the area and power of a 32-bit multiplier. An 8-bit fixed point multiplier consumes roughly 3% of the area and power as a 32-bit floating point multiplier. If the algorithm can use 8 bit fixed point numbers instead of 32-bit floating point, only ¼ the memory is needed to store the data and ¼ the bus bandwidth is needed to move the data. These are significant savings in area and power. By doing quantized aware training, the required bit widths can be reduced even further. Typically, networks that are trained in a quantized aware fashion need about ½ the bit width as a post training quantized network. This reduces the storage and communication costs by ½ and the multiplier area and power by ¾. Quantize aware trained networks typically require only 3-8 bits of fixed point representation. Occasionally, some layers can be just a single bit. And a 1 bit multiplier is an “and” gate.”
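Klein's figures can be checked with a small model built from his two rules of thumb: multiplier cost scales with the square of the bit width, and a fixed point multiplier costs about half as much as a floating point one of the same width. The function names below are hypothetical, and the model is only as accurate as those two approximations.

```python
def relative_multiplier_cost(bits: int, fixed_point: bool, ref_bits: int = 32) -> float:
    """Area/power of a multiplier relative to a ref_bits floating point one.

    Assumes cost grows with the square of the bit width and that fixed
    point costs about half of floating point, per the quoted rules of thumb.
    """
    cost = (bits / ref_bits) ** 2
    if fixed_point:
        cost *= 0.5
    return cost


def relative_storage(bits: int, ref_bits: int = 32) -> float:
    """Memory and bus bandwidth relative to ref_bits values."""
    return bits / ref_bits


print(relative_multiplier_cost(16, fixed_point=False))  # 0.25
print(relative_multiplier_cost(8, fixed_point=True))    # 0.03125, roughly 3%
print(relative_storage(8))                              # 0.25
```

The same model reproduces the quantization-aware training claim: halving the bit width again, for example from 8 to 4 bits, halves storage and cuts multiplier cost to a quarter.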
Also, when aggressively quantizing a network, overflow becomes a significant issue. “With 32 bit floating point numbers developers don’t need to worry about values exceeding the capacity of the representation. But with small fixed point numbers this must be addressed. It is likely that overflow will occur frequently. Using saturating operators is one way to fix this. Instead of overflowing, the operation will store the largest possible value for the representation. It turns out this works very well for machine learning algorithms, as the exact magnitude of a large intermediate sum is not significant, just the fact that it got large is sufficient. Using saturating math allows developers to shave an additional one or two bits off the size of the fixed point numbers they are using. Some neural networks do need the dynamic range offered by floating point representations. They simply lose too much accuracy when converted to fixed point, or require more than 32 bits of representation to deliver good accuracy. In this case there are several floating point representations that can be used. B-float16 (or “brain float”), developed by Google for their NPU, is a 16 bit float that is easily converted to and from traditional floating point. As with smaller fixed point numbers it results in smaller multipliers and less data storage and movement. There is also an IEEE-754 16 bit floating point number, and NVIDIA’s Tensorfloat,” Klein added.
Using any of these would result in a smaller, faster, lower power design.
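Two of the techniques Klein mentions are easy to sketch in software: saturating addition on small signed integers, and conversion between 32-bit floats and bfloat16. The function names are hypothetical. The bfloat16 conversion here simply keeps the upper 16 bits of the IEEE-754 single-precision encoding (truncation); real hardware commonly rounds to nearest instead, so treat this as an illustration of why the conversion is cheap, not as any vendor's implementation.

```python
import math
import struct


def saturating_add(a: int, b: int, bits: int = 8) -> int:
    """Signed add that clamps to the representable range instead of wrapping."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, a + b))


def wrapping_add(a: int, b: int, bits: int = 8) -> int:
    """Signed two's-complement add that wraps on overflow, for comparison."""
    s = (a + b) & ((1 << bits) - 1)
    return s - (1 << bits) if s >= (1 << (bits - 1)) else s


def float32_to_bfloat16_bits(x: float) -> int:
    """Keep the top 16 bits (sign, 8 exponent bits, 7 mantissa bits)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16


def bfloat16_bits_to_float32(b: int) -> float:
    """Widen bfloat16 back to float32 by padding the mantissa with zeros."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x


print(wrapping_add(100, 100))    # -56: the sum wrapped around
print(saturating_add(100, 100))  # 127: clamped to the largest 8-bit value
print(hex(float32_to_bfloat16_bits(math.pi)))                    # 0x4049
print(bfloat16_bits_to_float32(float32_to_bfloat16_bits(math.pi)))  # 3.140625
```

The wrapped result has the wrong sign, while the saturated one still signals "large", which is the property the quote relies on. The bfloat16 round trip keeps the full float32 exponent range but only about two to three decimal digits of precision.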
Additionally, Woo said, “If you have a general-purpose core, it’s really good at doing a lot of things, but it won’t do any of them great. It’s just general. At any point in time when you’re doing your workload, there are going to be parts of that general-purpose core that are in use, and parts that are not. It takes area, it takes power to have these things. What people began to realize was Moore’s Law is still giving us more transistors, so maybe the right thing to do is build these specialized cores that are good at certain tasks along the AI pipeline. At times you will turn them off, and at times you’ll turn them on. But that’s better than having these general-purpose cores where you’re always wasting some area and power, and you’re never getting the best performance. Along with a market that’s willing to pay — a very high-margin, high-dollar market — that is a great combination.”
It’s also a relatively well understood approach in the hardware engineering world. “You bring up version 1, and once you’ve installed it, you find out what works, what doesn’t, and you try and fix the issues,” said Marc Swinnen, director of product marketing at Ansys. “The applications that you run are vital to understand what these tradeoffs need to be. If you can make your hardware match the applications you want to run, you get much more efficient design than using off-the-shelf stuff. The chip you make for yourself is ideally suited to exactly what you want to do.”
This is why some generative AI developers are exploring building their own silicon, which suggests that in their eyes, even the current semiconductors are not good enough for what they want to do going forward. It’s one more example of how AI is changing processor design and the surrounding market dynamics.
AI also likely will play heavily into the chiplet world, where semi-custom and custom hardware blocks can be characterized and added into designs without the need to create everything from scratch. Big chipmakers such as Intel and AMD have been doing this internally for some time, but fabless companies are at a disadvantage.
“The problem is that your chiplets have to compete against existing solutions,” said Andy Heinig, department head for efficient electronics at Fraunhofer IIS’ Engineering of Adaptive Systems Division. “And if you’re not currently focusing on performance, you can’t compete. People are focused on getting this ecosystem up and running. But from our perspective, it’s a little bit of a chicken-and-egg problem. You need the performance, especially because the chips are more expensive than an SoC solution. But you can’t currently really focus on performance because you have to get this ecosystem up and running first.”
The right start
“It’s very important when these tradeoffs are happening to have a notion of what the goal is in mind,” said Expedera’s Chole. “If you just say, ‘I want to do everything and support everything,’ then you’re not really optimizing for anything. You’re basically just putting a general-purpose solution inside there and hoping it will meet your power requirement. That, in our understanding, has rarely worked. Every neural network and every deployment case on edge devices is unique. If your chip is going into a headset and running an RNN, versus sitting in an ADAS chip and running transformers, it’s a completely different use case. The NPUs, the memory systems, the configuration, and the power consumption are totally different. So it is very important that we understand what is the important set of workloads that we want to try. These can be multiple networks. You must get to the point that the team agrees on the networks that are important, and optimizes based on those. That’s missing when engineering teams are thinking about NPUs. They’re just thinking that they want to get the best in the world, but you cannot have the best without trading off something. I can give you the best, but in what area do you want the best?”
Cadence’s Mitra noted that everybody thinks about PPA in a similar way, but then people emphasize which part of the power, performance, area/cost (PPAC) they care about. “If you’re a data center guy, you may be okay with maybe sacrificing a little bit of area, because what you’re gunning for is very high-throughput machines because you need to do billions of AI inferences or AI things, which at one shot are trading market shares while running humongous models that lead to humongous amounts of data. Long gone are the days when you can think about a desktop running things for AI model development work for inferencing but even the inferencing for some of these large language models thing is getting pretty tricky. It means you need a massive data cluster and you need massive data compute on the data center scale at the hyperscalers.”
There are other considerations as well. “Hardware architectural decisions drive this, but the role of software is also critical,” said William Ruby, product management director for Synopsys’ EDA Group, noting that performance versus energy efficiency is key. “How much memory is needed? How would the memory subsystem be partitioned? Can the software code be optimized for energy efficiency? (Yes, it can.) Choice of process technology is also important – for all the PPAC reasons.”
Further, if power efficiency is not a priority, an embedded GPU can be used, according to Gordon Cooper, product manager for AI/ML processors at Synopsys. “It will give you the best flexibility in coding, but will never be as power- and area-efficient as specialized processors. If you are designing with an NPU, then there are still tradeoffs to make in terms of balancing area versus power. Minimizing on-chip memory should significantly decrease your total area budget, but will increase data transfers from external memory, which significantly increases power. Increasing on-chip memory will decrease power from external memory reads and writes.”
“People look at the AI training part as, ‘Oh wow, that’s really computationally heavy. It’s a lot of data movement,'” said Woo. “And once you want to throw all this acceleration hardware at it, then the rest of the system starts to get in the way. For this reason, increasingly we’re seeing these platforms from companies like NVIDIA and others, which have elaborate AI training engines, but they also may have Intel Xeon chips in them. That’s because there is this other part of the computation that the AI engines just are not well suited to do. They’re not designed to run general-purpose code, so more and more this is a heterogeneous system issue. You’ve got to get everything to work together.”
The other piece of the puzzle is on the software side, which can be made more efficient through a variety of methods such as reduction. “This is the realization that within AI, there’s a specific part of the algorithm and a specific computation called a reduction, which is a fancy way of taking a lot of numbers and reducing it down to one number or small set of numbers,” Woo explained. “It could be adding them all together or something like that. The conventional way to do this is if you’ve got all this data coming from all these other processors, send it through the interconnection network to one processor, and have that one processor just add everything. All these numbers are going through this network through switches to get to this processor. So why don’t we just add them in the switch, because they’re all going by the switch? The advantage is it’s similar to in-line processing. What’s fascinating is that once you’re done adding everything in the switch, you only need to deliver one number, which means the amount of network traffic goes down.”
Architectural choices like this are worth considering because they tackle several issues at once, said Woo. First, the movement of data across networks is incredibly slow, so it pays to move as little data as possible. Second, it gets rid of the redundant work of delivering the data to a processor only to have the processor do all the math and then deliver the result back. It all gets done in the network. And third, it’s very parallel, so you can have each switch doing part of that computation.
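The traffic saving Woo describes can be shown with a toy model. This is a simplified sketch with hypothetical function names: it assumes a single layer of switches, each collecting values from its attached processors, and it only counts how many values must be delivered onward to the final processor. Real in-network reduction involves switch hardware, protocols, and timing that are not modeled here.

```python
def reduce_at_processor(values_per_switch: list[list[float]]) -> tuple[float, int]:
    """Conventional approach: every value is forwarded to one processor.

    Returns (sum, number of values delivered to that processor).
    """
    delivered = [v for switch_values in values_per_switch for v in switch_values]
    return sum(delivered), len(delivered)


def reduce_in_switches(values_per_switch: list[list[float]]) -> tuple[float, int]:
    """In-network approach: each switch adds the values passing through it.

    Each switch forwards a single partial sum, so the final processor
    receives one value per switch instead of one value per sender.
    """
    partial_sums = [sum(switch_values) for switch_values in values_per_switch]
    return sum(partial_sums), len(partial_sums)


# Three hypothetical switches with three, two, and one attached senders.
values = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
print(reduce_at_processor(values))  # (21.0, 6)
print(reduce_in_switches(values))   # (21.0, 3)
```

Both approaches produce the same total, but the in-switch version delivers fewer values, and the partial sums are computed independently, which reflects the parallelism mentioned in the text.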
Likewise, Expedera’s Chole said AI workloads now can be defined by a single graph. “Having that graph is not for a small set of instructions. We are not doing one addition. We are doing millions of additions at once, or we are doing 10 million matrix multiplication operations at once. That changes the paradigm of what you are thinking about execution, how you’re thinking about instructions, how you can compress the instructions, how you can predict and schedule the instructions. Doing this in general-purpose CPU is not practical. There is too much cost to be able to do this. However, as a neural network, where the number of MACs that are active simultaneously is huge, the way you can generate the instructions, create the instructions, compress the instructions, schedule the instructions, changes a lot in terms of the utilization and bandwidth. That has been the big impact of AI on the processor architecture side.”
Training AI models is a whole lot faster in 2023, according to the results from the MLPerf Training 3.1 benchmark released today.
The pace of innovation in the generative AI space is breathtaking to say the least. A key part of the speed of innovation is the ability to rapidly train models, which is something that the MLCommons MLPerf training benchmark tracks and measures. MLCommons is an open engineering consortium focused on ML benchmarks, datasets and best practices to accelerate the development of AI.
The MLPerf Training 3.1 benchmark included submissions from 19 vendors that generated over 200 performance results. Among the tests were benchmarks for large language model (LLM) training with GPT-3 and a new benchmark for training the open source Stable Diffusion text to image generation model.
“We’ve got over 200 performance results and the improvements in performance are fairly substantial, somewhere between 50% to almost up to 3x better,” MLCommons executive director David Kanter said during a press briefing.
LLM training gets an oversized boost that is beating Moore’s Law
Of particular note among all the results in the MLPerf Training 3.1 benchmark are the numbers on large language model (LLM) training. It was only in June that MLCommons included data on LLM training for the first time. Now, just a few months later, the MLPerf 3.1 training benchmarks show a nearly 3x gain in the performance of LLM training.
“It’s about 2.8x faster comparing the fastest LLM training benchmark in the first round [in June], to the fastest in this round,” Kanter said. “I don’t know if that’s going to keep up in the next round and the round after that, but that’s a pretty impressive improvement in performance and represents tremendous capabilities.”
In Kanter’s view, the performance gains over the last five months for AI training are outpacing what Moore’s Law would predict. Moore’s Law forecasts a doubling of compute performance every couple of years. Kanter said that the AI industry is scaling hardware architecture and software faster than Moore’s Law would predict.
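Kanter's comparison can be made concrete with a quick calculation. The sketch below assumes that "every couple of years" means a doubling every 24 months, which is an assumption and not a figure from MLCommons, and it uses the 2.8x gain over roughly five months quoted above. The function names are hypothetical.

```python
import math


def moore_expected_gain(months: float, doubling_months: float = 24.0) -> float:
    """Gain expected after `months` if performance doubles every `doubling_months`."""
    return 2.0 ** (months / doubling_months)


def implied_doubling_months(gain: float, months: float) -> float:
    """Doubling period implied by an observed `gain` over `months`."""
    return months * math.log(2.0) / math.log(gain)


expected = moore_expected_gain(5.0)
observed = 2.8

print(f"Expected over 5 months at a 24-month doubling: {expected:.2f}x")
print(f"Observed in MLPerf Training 3.1: {observed:.1f}x")
print(f"Implied doubling period: {implied_doubling_months(observed, 5.0):.1f} months")
```

Under these assumptions, a 24-month doubling would predict a gain of roughly 1.16x over five months, so a 2.8x gain corresponds to doubling in well under half a year. Note that the benchmark gain combines hardware, software, and scale improvements, so it is not a like-for-like comparison with transistor scaling.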
“MLPerf is to some extent a barometer on progress for the whole industry,” Kanter said.
Nvidia, Intel and Google boast big AI training gains
Intel, Nvidia and Google have made significant strides in recent months that enable faster LLM training results in the MLPerf Training 3.1 benchmarks.
Intel claims that its Habana Gaudi 2 accelerator achieved a 103% training speed boost over the June MLPerf training results, using a combination of techniques including 8-bit floating point (FP8) data types.
“We enabled FP8 using the same software stack and we managed to improve our results on the same hardware,” Itay Hubara, senior researcher at Intel, commented during the MLCommons briefing. “We promised to do that in the last submission and we delivered.”
Google is also claiming training gains with its Cloud TPU v5e, which only became generally available on Aug. 29. Much like Intel, Google is using FP8 to get the best possible training performance. Vaibhav Singh, product manager for cloud accelerators at Google, also highlighted the scaling capabilities that Google has developed, which include the Cloud TPU multislice technology.
“What Cloud TPU multislice does is it has the ability to scale over the data center network,” Singh explained during the MLCommons briefing.
“With the multislice scaling technology, we were able to get a really good scaling performance up to 1,024 nodes using 4,096 TPU v5e chips,” Singh said.
Nvidia used its EOS supercomputer to supercharge training
Not to be outdone on scale, Nvidia has its own supercomputer known as EOS, which it used to conduct its MLPerf Training 3.1 benchmarks. Nvidia first spoke about its initial plans to build EOS back in 2022.
Nvidia reported that its LLM training results for MLPerf were 2.8 times faster than they were in June for training a model based on GPT-3. In an Nvidia briefing on the MLCommons results, Dave Salvator, director of accelerated computing products at Nvidia, said that EOS has 10,752 GPUs connected via Nvidia Quantum-2 InfiniBand running at 400 gigabits per second. The system has 860 terabytes of HBM3 memory. Salvator noted that Nvidia has also worked on improving software to get the best possible outcome for training.
“Some of the speeds and feed numbers here are kind of mind blowing,” Salvator said. “In terms of AI compute, it’s over 40 exaflops of AI compute, right, which is just extraordinary.”
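To put the EOS totals in perspective, they can be divided by the GPU count. This is only back-of-the-envelope arithmetic on the figures quoted above. It assumes decimal units (1 TB = 10^12 bytes, 1 exaflop = 10^18 operations per second) and treats "over 40 exaflops" as exactly 40, so the per-GPU compute figure is a lower bound.

```python
GPU_COUNT = 10_752
TOTAL_HBM3_TB = 860.0       # terabytes, as quoted
TOTAL_AI_EXAFLOPS = 40.0    # "over 40 exaflops", taken as a floor

# Memory per GPU in gigabytes (1 TB = 1000 GB).
hbm_per_gpu_gb = TOTAL_HBM3_TB * 1000.0 / GPU_COUNT

# AI compute per GPU in petaflops (1 exaflop = 1000 petaflops).
petaflops_per_gpu = TOTAL_AI_EXAFLOPS * 1000.0 / GPU_COUNT

print(f"HBM3 per GPU: about {hbm_per_gpu_gb:.0f} GB")
print(f"AI compute per GPU: at least {petaflops_per_gpu:.2f} petaflops")
```

The result is about 80 GB of HBM3 and a little under 4 petaflops of AI compute per GPU.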
VMware’s do-it-yourself private AI program has two new partners, offering users more options for in-house generative AI training and deployments.
VMware has two new corporate partners, in the form of Intel and IBM, for its private AI initiative. The partnerships will provide new architectures for users looking to keep their generative AI training data in-house, as well as new capabilities and features.
Private AI, as VMware describes it, is essentially a generative AI framework designed to let customers use the data in their own enterprise environments -- whether public cloud, private cloud or on-premises -- for large language model (LLM) training purposes, without having to hand off data to third parties. VMware broke ground on the project publicly in August, when it rolled out VMware Private AI Foundation with NVIDIA. That platform combines the VMware Cloud Foundation multicloud architecture with Nvidia AI Enterprise software, offering systems built by Dell, HPE and Lenovo to provide on-premises AI capabilities.
Earlier this week, at the VMware Explore event in Barcelona, the company announced that it had expanded its partnership with Intel to offer that company's Max Series GPUs, Xeon processors, and AI software tools as an alternative to Nvidia. The Max Series GPU is designed to pack 128 of the company's most advanced AI-focused cores onto a given chip, while its latest Xeon CPUs boast the Intel Advanced Matrix Extension additions to its x86 instruction set -- also designed to maximize performance for AI and machine learning tasks.
"VMware and Intel are designing a VMware Private AI reference architecture that will enable customers to build and deploy Private AI models and reduce TCO by harnessing the Intel AI software kit, processors, and hardware accelerators with VMware Cloud Foundation," the company said in an official blog post.
VMware also announced a partnership with IBM, which will bring that company's watsonx AI framework to VMware customers, using Cloud Foundation and Red Hat's OpenShift containerization platform to provide a full-stack architecture for data management, governance and operational machine learning tasks.
"Additionally, organizations will be able to access IBM-selected open-source models from Hugging Face, as well as other third-party models and a family of IBM-trained foundation models to support GenAI use cases," the blog post said.
VMware, which is in the process of being acquired by Broadcom in a landmark $61 billion deal, has suffered significant headwinds in recent years, with its customer base eroding in the face of significant price increases and worsening support. Analysts from Forrester Research predicted late last month that 2024 will see VMware's customer base decline by as much as 20%.
Intel (NASDAQ:INTC) shares rose in pre-market trading as the company's chief executive made positive comments about its nascent foundry business, stating that its 18A advanced node is likely to go into test production early next year.
"For 18A, we have many test wafers coming out at this moment," Intel (INTC) Chief Executive Pat Gelsinger said at the company's Innovation Day in Taipei, according to Nikkei Asia. "The invention phase of the 18A is now complete, and now we're racing to production."
Gelsinger is transitioning Intel (INTC) to focus heavily on manufacturing chips for other semiconductor companies, positioning the company against foundry heavyweights such as Taiwan Semiconductor (NYSE:TSM), Samsung (OTCPK:SSNLF) and GlobalFoundries (GFS).
In its most recent quarter, Intel's (INTC) foundry segment saw revenue rise nearly 300% year-over-year to $311M. That's well below the $17.28B that Taiwan Semiconductor (TSM) generated during the same period, but the growth led to positive sentiment from Wall Street.
"Two and a half years into that journey and guess what? It's happening, we are on track to deliver five nodes in four years," Gelsinger said on Tuesday.
Gelsinger also said that Intel 3, its next-gen server and PC chips, is at the "debug" phase and also set to go into production in 2024.
The 62-year-old chief executive also weighed in on the rise of chip designs from Arm Holdings (ARM) moving to the PC market, where Intel has long dominated with its x86 architecture, and downplayed the threat, though he did give kudos to Apple (AAPL). "They [Apple] control the ecosystem," Gelsinger said of the Mac-maker.
Gelsinger, who returned to Intel in 2021, added that Arm-based chips represent a large opportunity for Intel to boost its foundry business.
"Arm is now putting their leading-edge [chip] design on Intel 18A and finding very good power performance results from those designs," Gelsinger said, according to the news outlet.
"Every Arm [chip design] licensee, I want them to become [our] foundry customers going forward. As a result, capturing Arm's [IP base] opens up a lot of business opportunities for our foundry."
Researchers have used a supercomputer powered by Intel Corp. processors to run four 1 trillion-parameter language models simultaneously.
The chipmaker detailed the milestone at Supercomputing 2023, a major industry event taking place today in Denver. The supercomputer that researchers used to run the four language models is the U.S. Energy Department’s recently installed Aurora system. Alongside its announcement of the researchers’ achievement, Intel shared new details about its upcoming Gaudi-3 and Falcon Shores artificial intelligence chips.
Aurora was installed at the Energy Department’s Argonne National Laboratory earlier this year. It comprises more than 10,000 servers that feature about 21,000 central processing units and 60,000 graphics processing units from Intel. Once fully operational, Aurora is expected to rank as the world’s fastest supercomputer with more than two exaflops of performance.
Argonne National Laboratory, Intel and several other organizations have teamed up to use the system for AI development. The initiative aims to create generative AI models with more than one trillion parameters that can help speed up research projects. Engineers are training those models on datasets comprising text, code and scientific information.
At Supercomputing 2023 today, Intel disclosed that Aurora managed to run an AI model with one trillion parameters using only 64 of its 10,000-plus servers. Moreover, researchers managed to run four such models at the same time across 256 nodes. Each such node, which weighs 70 pounds, includes two Intel Xeon Max Series CPUs and no fewer than six Intel Max Series GPU graphics cards.
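As a rough back-of-the-envelope check on those figures, the sketch below works out how many GPUs serve each one-trillion-parameter model and how much weight data would land on each GPU. The node counts and per-node hardware come from the article; the two-bytes-per-parameter figure (a 16-bit format) and the weights-only accounting are assumptions made for illustration, and real training needs additional memory for optimizer state and activations.

```python
# Figures reported in the article.
NODES_PER_MODEL = 64
GPUS_PER_NODE = 6
PARAMETERS = 1_000_000_000_000

# Assumption (not from the article): 16-bit weights, 2 bytes each.
BYTES_PER_PARAMETER = 2


def nodes_needed(models: int) -> int:
    """Nodes required to run the given number of models side by side."""
    return models * NODES_PER_MODEL


def gpus_per_model() -> int:
    """GPUs available to a single model instance."""
    return NODES_PER_MODEL * GPUS_PER_NODE


def weight_gb_per_gpu() -> float:
    """Weight data per GPU in gigabytes if weights are split evenly."""
    total_bytes = PARAMETERS * BYTES_PER_PARAMETER
    return total_bytes / gpus_per_model() / 1e9


if __name__ == "__main__":
    print(nodes_needed(4))                  # 256, matching the article
    print(gpus_per_model())                 # 384
    print(round(weight_gb_per_gpu(), 2))    # 5.21
```

Under these assumptions the weights alone take up only a few gigabytes per GPU, which suggests that the rest of the training state, rather than the raw weights, dominates memory use.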
Next-generation AI chips
The Max Series GPUs in Aurora are based on an architecture called Xe HPC that Intel has developed in-house. Intel also offers a second AI processor, the Gaudi 2, that targets many of the same use cases. The Gaudi 2 (pictured) is based on a design that Intel obtained through its $2 billion acquisition of startup Habana Labs Ltd. in 2019.
Intel eventually plans to merge the two product lines into a single chip series based on a unified architecture. But before then, the company will launch an upgraded version of the Gaudi 2. As part of its Supercomputing 2023 presentation, the company shared new details about that upcoming chip.
Gaudi 3, as the processor is called, will reportedly be made using a five-nanometer process. Whereas its predecessor was implemented as a single piece of silicon, Gaudi 3 comprises two separate chiplets. Both Intel and its competitors are adopting a chiplet-based approach to building processors because it simplifies manufacturing in several respects.
One of the current-generation Gaudi 2’s main selling points is that it includes built-in Ethernet ports. This reduces the need for external networking hardware, which lowers costs. The Gaudi 3 will reportedly feature twice the networking capacity of its predecessor as well as 1.5 times more onboard memory for storing AI models’ data.
Thanks to Intel’s design upgrades, the Gaudi 3 is expected to provide four times the performance of its predecessor when crunching bfloat16 data. This is a specialized data format developed by Google LLC that many AI models use to store the information they process. The format’s popularity stems from the fact that it can help reduce the amount of memory that a neural network requires and speed up processing.
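To make the memory saving concrete, here is a minimal sketch of how a 32-bit float maps to bfloat16. It assumes the simplest possible conversion, which keeps the top 16 bits of the float32 pattern and drops the rest; hardware and libraries typically round rather than truncate, so this is an illustration of the layout, not of any particular chip's behavior. The function names are hypothetical.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x by truncating its float32 form."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # bfloat16 keeps the sign bit, all 8 exponent bits and the top 7 mantissa bits.
    return bits >> 16


def bfloat16_bits_to_float(b: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x


if __name__ == "__main__":
    b = float32_to_bfloat16_bits(3.14159)
    print(hex(b))                      # 0x4049
    print(bfloat16_bits_to_float(b))   # 3.140625
```

Each value takes two bytes instead of four, at the cost of keeping only about two to three significant decimal digits, while the exponent range stays the same as float32.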
Intel plans to merge the Gaudi chip lineup with the Max Series GPU line, which powers the Aurora supercomputer, into a new product portfolio dubbed Falcon Shores. Both the Gaudi and Max Series GPU lines will provide forward compatibility with the portfolio. That means AI models written for the two chip lines will also work on Falcon Shores silicon.
Intel detailed today that Falcon Shores chips will feature HBM3 memory, the latest iteration of the high-speed RAM included in many AI processors. HBM3 is faster than previous-generation hardware and uses less power. Falcon Shores products will also support oneAPI, an Intel technology that promises to reduce the amount of work involved in writing AI applications.
The third major focus of Intel’s Supercomputing 2023 announcements was its upcoming Emerald Rapids line of server CPUs. The chip series, which is set to launch next month, is based on the company’s 10-nanometer process. Intel released new performance data that indicates Emerald Rapids can provide a significant speed improvement over previous-generation silicon.
The most advanced CPU in the Emerald Rapids portfolio offers 64 cores. Compared with Intel’s fastest previous-generation chip, which features 56 cores, the new CPU can run AI speech recognition applications up to 40% faster. It demonstrated a similar speed advantage in a test carried out using the LAMMPS benchmark, which measures how fast a chip can carry out computational chemistry tasks.
Windows, Linux, FreeBSD all can update the microcode. I haven't checked other *BSD flavors, but if FreeBSD can, there's a good chance the others can too.
I'm not positive whether Apple can on Intel Macs, but I would be extremely surprised if it could not, since Apple's philosophy is very much that they will update your hardware for you, so the OS 'just works' and is kept secure by their updates.
I guess if you are running something else like ReactOS or FreeDOS, maybe no?
But in that case, maybe just boot a Linux live OS from a USB or DVD, and let it update the CPU, then reboot back to your primary OS? EDIT: I wasn't sure if microcode updates are permanent and it turns out they are not - they have to be reloaded into the CPU with every reboot. So in that case, if you are running a niche OS that can't load the CPU microcode update every boot, you will need a UEFI firmware that will do it for you during boot.
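If you want to see which microcode revision is actually loaded after a boot, x86 Linux reports it per logical CPU in the "microcode" field of /proc/cpuinfo. Here is a small sketch that pulls out the distinct values; the function name is made up for this example, and it only reads what the kernel reports, it does not load anything.

```python
import re
from pathlib import Path


def microcode_revisions(cpuinfo_text: str) -> set[str]:
    """Collect the distinct 'microcode' values from /proc/cpuinfo-style text."""
    return set(
        re.findall(r"^microcode\s*:\s*(\S+)", cpuinfo_text, flags=re.MULTILINE)
    )


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        # Usually a single revision shared by every logical CPU.
        print(microcode_revisions(path.read_text()))
```

Running it before and after the OS applies an update should show the revision change, and after a cold boot without the update it should fall back to whatever the firmware loaded.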
Congatec, a leader in embedded and edge computing technology, has launched a new series of ultra-rugged COM Express Compact Computer-on-Modules. These modules are powered by the 13th Gen Intel Core processors and are distinguished by their extreme ruggedness, capable of operating in harsh temperature ranges from -40°C to +85°C. These new CoMs are specifically designed for challenging environments, featuring soldered RAM for enhanced shock and vibration resistance, complying with stringent railway standards. These modules are ideal for various applications, such as rail and off-road vehicles in sectors like mining, construction, agriculture, forestry, and other extreme-condition applications. They also cater to stationary devices and outdoor applications that experience significant temperature fluctuations, emphasizing the need for critical infrastructure protection in scenarios like earthquakes and other mission-critical events.
The new modules have CPUs with up to 14 cores and 20 threads, supported by ultra-fast LPDDR5x memory, making them highly efficient for multitasking in outdoor and rugged environments. The incorporation of Intel's hybrid architecture, combining Performance-cores and Efficient-cores, enhances their performance within optimized power budgets. Additionally, these modules are supported by congatec's comprehensive ecosystem, which includes active and passive cooling solutions, optional conformal coating for environmental protection, and carrier board schematics. Services like real-time hypervisor technology, shock and vibration testing, temperature screening, and high-speed signal compliance testing, complemented by design-in services and training sessions, are also available, ensuring a seamless integration of congatec's advanced computing technologies.
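The "14 cores and 20 threads" figure follows from the hybrid layout. As a small sketch, assuming each Performance-core runs two hardware threads and each Efficient-core runs one (an assumption about the processor family, not something stated above), the core split can be recovered from the two totals:

```python
def hybrid_core_split(total_cores: int, total_threads: int) -> tuple[int, int]:
    """Return (performance_cores, efficient_cores).

    Assumes P-cores run 2 threads and E-cores run 1, so:
        P + E  = total_cores
        2P + E = total_threads
    """
    performance_cores = total_threads - total_cores
    efficient_cores = total_cores - performance_cores
    if performance_cores < 0 or efficient_cores < 0:
        raise ValueError("totals are inconsistent with the 2-thread/1-thread assumption")
    return performance_cores, efficient_cores


if __name__ == "__main__":
    print(hybrid_core_split(14, 20))  # (6, 8)
```

Under that assumption the top configuration works out to six Performance-cores and eight Efficient-cores.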
COMPANY NEWS: At SC23, Intel showcased AI-accelerated high-performance computing (HPC) with leadership performance for HPC and AI workloads across Intel Data Centre GPU Max Series, Intel Gaudi 2 AI accelerators and Intel Xeon processors.
In partnership with Argonne National Laboratory, Intel shared progress on the Aurora generative AI (genAI) project, including an update on the one trillion parameter GPT-3 LLM on the Aurora supercomputer that is made possible by the unique architecture of the Max Series GPU and the system capabilities of the Aurora supercomputer.
Intel and Argonne demonstrated the acceleration of science with applications from the Aurora Early Science Program and the Exascale Computing Project. The company also showed the path to Intel Gaudi 3 AI accelerators and Falcon Shores.
“Intel has always been committed to delivering innovative technology solutions to meet the needs of the HPC and AI community. The great performance of our Xeon CPUs along with our Max GPUs and CPUs help propel research and science. That coupled with our Gaudi accelerators demonstrate our full breadth of technology to provide our customers with compelling choices to suit their diverse workloads,” said Intel corporate vice president and general manager of data centre AI solutions Deepak Patil.
Why it matters: Generative AI for science along with the latest performance and benchmark results underscore Intel’s ability to deliver tailored solutions to meet the specific needs of HPC and AI customers. Intel’s software-defined approach with oneAPI and HPC- and AI-enhanced toolkits helps developers seamlessly port their code across architectural frameworks to accelerate scientific research. Additionally, Max Series GPUs and CPUs will be deployed in multiple supercomputers that are coming online.
About Generative AI for Science: Argonne National Laboratory shared progress on its generative AI for Science initiatives with the Aurora supercomputer. The Aurora genAI project is a collaboration with Argonne, Intel and partners to create state-of-the-art foundational AI models for science. The models will be trained on scientific texts, code and science datasets at scales of more than 1 trillion parameters from diverse scientific domains. Using the foundational technologies of Megatron with DeepSpeed, the genAI project will service multiple scientific disciplines, including biology, cancer research, climate science, cosmology and materials science.
The distinctive Intel Max Series GPU architecture and the Aurora supercomputer system capabilities can efficiently handle one trillion-parameter models with just 64 nodes, far fewer than would be typically required. Argonne National Laboratory ran four instances on 256 nodes, demonstrating the ability to run multiple instances in parallel on Aurora, paving the path to scale the training of trillions of parameter models more quickly with trillions of tokens on more than 10,000 nodes.
About Intel and Argonne National Laboratory: Intel and Argonne National Laboratory demonstrated the acceleration of science at scale enabled by the system capabilities and software stack on Aurora. Workload examples include:
• Brain connectome reconstruction is enabled at scale with Connectomics ML, showing competitive inference throughput on more than 500 Aurora nodes.
• General Atomic and Molecular Electronic Structure System (GAMESS) showed over 2x competitive performance with Intel Max GPU compared to the Nvidia A100. This enables the modelling of complicated chemical processes in drug and catalyst design to unlock the secrets of molecular science with the Aurora supercomputer.
• Hardware/Hybrid Accelerated Cosmology Code (HACC) has demonstrated runs on more than 1,500 Aurora nodes, enabling the visualisation and understanding of the physics and evolution of the universe.
• The drug-screening AI inference application, part of the Aurora Drug Discovery early science project (ESP), enables efficient screening of vast chemical datasets, covering more than 20 billion of the most synthesised compounds on just 256 nodes.
Intel also showed new HPC and AI performance, as well as software optimisations across hardware and applications:
• Intel and Dell published results for STAC-A2, an independent benchmark suite based on real world market risk analysis workloads, showing great performance for the financial industry. Compared to eight Nvidia H100 PCIe GPUs, four Intel Data Centre GPU Max 1550s had 26% higher warm Greeks 10-100k-1260 performance and 4.3x higher space efficiency.
• The Intel Data Centre GPU Max Series 1550 outperforms Nvidia H100 PCIe card by an average of 36% (1.36x) on diverse HPC workloads.
• Intel Data Centre GPU Max Series delivers improved support for AI models, including multiple large language models (LLMs) such as GPT-J and LLAMA2.
• Intel Xeon CPU Max Series, the only x86 processor with high bandwidth memory (HBM), delivered an average 19% more performance compared to the AMD Epyc Genoa processor.
• Last week, MLCommons published results of the industry standard MLPerf training v3.1 benchmark for training AI models. Intel Gaudi2 demonstrated a significant 2x performance leap with the implementation of the FP8 data type on the v3.1 training GPT-3 benchmark.
o Intel will usher in Intel Gaudi3 AI accelerators in 2024. The Gaudi3 AI accelerator will be based on the same high-performance architecture as Gaudi2 and is expected to deliver four times the compute (BF16), double the networking bandwidth for greater scale-out performance, and 1.5x the on-board HBM memory to readily handle the growing demand for higher performance, high-efficiency compute of LLMs without performance degradation.
• 5th Gen Intel Xeon processors will deliver up to 1.4x higher performance gen-over-gen on HPC applications as demonstrated by LAMMPS-Copper.
o Granite Rapids, a future Intel Xeon processor, will deliver increased core count and built-in acceleration with Intel Advanced Matrix Extensions and support for multiplexer combined ranks (MCR) DIMMs. Granite Rapids will have 2.9x better DeepMD+LAMMPS AI inference. MCR achieves speeds of 8,800 mega transfers per second based on DDR5 and greater than 1.5 terabytes per second of memory bandwidth capability in a two-socket system, which is critical for feeding the fast-growing core counts of modern CPUs and enabling efficiency and flexibility.
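The memory bandwidth claim for MCR DIMMs can be sanity-checked with simple arithmetic. The sketch below assumes 12 memory channels per socket and a 64-bit (8-byte) data path per channel; neither number appears in the announcement, so treat both as illustrative assumptions rather than confirmed specifications.

```python
def peak_bandwidth_gb_s(
    mega_transfers_per_s: float,
    channels_per_socket: int,
    sockets: int,
    bytes_per_transfer: int = 8,  # assumed 64-bit data path per channel
) -> float:
    """Theoretical peak memory bandwidth in GB/s."""
    transfers_per_s = mega_transfers_per_s * 1e6
    bytes_per_s = transfers_per_s * bytes_per_transfer * channels_per_socket * sockets
    return bytes_per_s / 1e9


if __name__ == "__main__":
    # 8,800 MT/s from the announcement; 12 channels per socket is an assumption.
    print(peak_bandwidth_gb_s(8800, channels_per_socket=12, sockets=2))
```

With those inputs the theoretical peak comes to roughly 1,690 GB/s, which is consistent with the stated "greater than 1.5 terabytes per second" for a two-socket system.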
About New Progress on oneAPI: Intel announced features for its 2024 software development tools that advance open software development powered by oneAPI multiarchitecture programming. New tools help developers extend new AI and HPC capabilities on Intel CPUs and GPUs with broader coverage, including faster performance and deployments using standard Python for numeric workloads, and compiler enhancements delivering a near-complete SYCL 2020 implementation to improve productivity and code offload.
Additionally, Texas Advanced Computing Centre (TACC) announced its oneAPI Centre of Excellence will focus on projects that develop and optimise seismic imaging benchmark codes. Intel fosters an environment where software and hardware innovation and research advance the industry, with 32 oneAPI Centres of Excellence worldwide.
What’s next: Intel emphasised its commitment to AI and HPC and highlighted market momentum. New supercomputer deployments with Intel Max Series GPU and CPU technologies include systems like Aurora, Dawn Phase 1, SuperMUC-NG Phase 2, Clementina XXI and more. New systems featuring Intel Gaudi2 accelerators include a large AI supercomputer with Stability AI as the anchor customer.
This momentum will be foundational for Falcon Shores, Intel’s next-generation GPU for AI and HPC. Falcon Shores will leverage the Intel Gaudi and Intel Xe intellectual property (IP) with a single GPU programming interface built on oneAPI. Applications built on Intel Gaudi AI accelerators, as well as Intel Max Series GPUs today will be able to migrate with ease to Falcon Shores in the future.
BARCELONA/VMware Explore 2023 - VMware, Inc. (NYSE: VMW) today announced a collaboration with Intel to extend the companies' more than two decades of innovation to help customers accelerate the adoption of artificial intelligence (AI) and enable private AI everywhere – across data centers, public clouds, and edge environments. VMware and Intel are working to deliver a jointly validated AI stack that will enable customers to use their existing general-purpose VMware and Intel infrastructure and open source software to simplify building and deploying AI models. The combination of VMware Cloud Foundation and Intel's AI software suite, Intel® Xeon® processors with built-in AI accelerators, and Intel® Max Series GPUs, will deliver a validated and benchmarked AI stack for data preparation, model training, fine-tuning and inferencing to accelerate scientific discovery and enrich business and consumer services.
More than 300,000 customers deploy VMware Cloud globally, and VMware virtualization software is deployed nearly everywhere in the enterprise where data is created, processed, or consumed. This makes VMware Cloud a fast means to bring AI-accelerated compute and models to wherever business gets done. Similarly, Intel offers open, scalable and trusted solutions to hundreds of thousands of customers. The ubiquity of VMware and Intel products in the enterprise is a powerful combination that will increase the accessibility of data science and enable organizations globally to adopt Private AI, an architectural approach that aims to balance the business gains from AI with practical privacy and compliance needs.
“When it comes to AI, there is no longer any reason to debate trade-offs in choice, privacy, and control. Private AI empowers customers with all three, enabling them to accelerate AI adoption while future-proofing their AI infrastructure,” said Chris Wolf, vice president of VMware AI Labs. “VMware Private AI with Intel will help our mutual customers dramatically increase worker productivity, ignite transformation across major business functions, and drive economic impact.”
“For decades, Intel and VMware have delivered next-generation data center-to-cloud capabilities that enable customers to move faster, innovate more, and operate efficiently,” said Sandra Rivera, executive vice president and general manager of the Data Center and AI Group (DCAI) at Intel. “With the potential of artificial intelligence to unlock powerful new possibilities and improve the life of every person on the planet, Intel and VMware are well equipped to lead enterprises into this new era of AI, powered by silicon and software.”
Boost AI Performance and get a More Secure AI Infrastructure with Lower TCO
VMware Private AI brings compute capacity and AI models to where enterprise data is created, processed, and consumed, whether in a public cloud, enterprise data center, or at the edge, in support of traditional AI/ML workloads and generative AI. VMware and Intel are enabling the fine-tuning of task-specific models in minutes to hours and the inferencing of large language models at faster than human communication using the customer's private corporate data. VMware and Intel now make it possible to fine-tune smaller, economical state-of-the-art models which are easier to update and maintain on shared virtual systems, which can then be delivered back to the IT resource pool when the batch AI jobs are complete. Use cases such as AI-assisted code generation, experiential customer service centers, recommendation systems, and classical machine statistical analytics can now be co-located on the same general-purpose servers running the application.
VMware and Intel are designing a reference architecture that combines Intel's AI software suite, Intel® Xeon® processors, and Data Center GPUs with VMware Cloud Foundation to enable customers to build and deploy private AI models on the infrastructure they have, thereby reducing total cost of ownership and addressing concerns of environmental sustainability. This VMware Private AI reference architecture with Intel AI will include:
VMware Cloud Foundation brings consistent enterprise-class infrastructure, operational simplicity, and enhanced security to VMware Private AI through capabilities such as:
VMware Private AI will be supported by servers from Dell Technologies, Hewlett Packard Enterprise and Lenovo running 4th Gen Xeon CPUs with Intel® Advanced Matrix Extensions (Intel® AMX) and Intel Max Series GPUs.
VMware is a leading provider of multi-cloud services for all apps, enabling digital innovation with enterprise control. As a trusted foundation to accelerate innovation, VMware software gives businesses the flexibility and choice they need to build the future. Headquartered in Palo Alto, California, VMware is committed to building a better future through the company's 2030 Agenda.