
Korean ‘Sovereign AI’ Project and the “From Scratch” Controversy

by 지식과 지혜의 나무 2026. 1. 5.


Introduction

Figure: Naver Cloud's booth at the Independent AI Foundation Model project presentation event (Dec 30, 2025).

South Korea's government launched this initiative to foster homegrown "sovereign AI" models, and the project has sparked debate over what it means to develop an AI model "from scratch" using 100% indigenous technology. In late 2025, five teams (Naver Cloud, Upstage, SK Telecom, NC AI, and LG AI Research) were chosen as finalists to build Korea's representative AI model. A key evaluation criterion set by the government was that models be built from the ground up ("from scratch") rather than fine-tuned from existing foreign models. However, two high-profile controversies have since arisen, with Korean teams being accused of leveraging Chinese open-source AI models. These incidents, involving startup Upstage and tech giant Naver, highlight the challenges of defining "from scratch" and underscore the growing influence of Chinese AI technology.

The “From Scratch” Criterion in Korea’s AI Initiative

South Korea's Ministry of Science and ICT made "from scratch" development a major evaluation factor in its Independent AI Foundation Model project. "From scratch" traditionally means starting with nothing: collecting data, designing the architecture, and training a model from randomly initialized weights, without using or fine-tuning an existing model. The intent is to ensure the resulting AI is truly indigenous (sometimes dubbed a "sovereign AI"), which was seen as crucial for a national AI champion project that should not simply repackage foreign models. In practice, however, the concept proved ambiguous. AI industry experts noted that "from scratch" lacks a universally accepted definition, and companies have sometimes misused the term for marketing even when their models were fine-tuned on existing bases. As one AI researcher observed, the government never clearly specified its criteria for originality, leading to confusion. The government responded by planning rigorous verification: teams must submit final model files and intermediate training checkpoints for inspection by a specialized agency (the TTA) to confirm that each model matches its from-scratch development claims.
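
To make the checkpoint idea concrete, here is a minimal sketch of what an auditable training run might record. The function name, file layout, and hashing scheme are illustrative assumptions, not the actual TTA procedure; the point is simply that intermediate checkpoints plus content hashes let a third party verify that a model's loss curve really descends from specific earlier weights.

```python
import hashlib
import torch

def save_audit_checkpoint(model, step, out_dir):
    """Hypothetical audit helper: save an intermediate checkpoint plus a
    SHA-256 digest of its weights, so an external auditor can later
    confirm the training trajectory started from these exact weights."""
    state = model.state_dict()
    path = f"{out_dir}/ckpt_step{step:08d}.pt"
    torch.save(state, path)
    digest = hashlib.sha256()
    for name in sorted(state):
        # Cast to float32 for a dtype-independent, reproducible hash.
        digest.update(state[name].detach().cpu().float().numpy().tobytes())
    with open(path + ".sha256", "w") as f:
        f.write(digest.hexdigest())
```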

Upstage’s SOLAR Model vs. ZhipuAI’s GLM: Plagiarism Scare

The first controversy erupted around Upstage, a Korean AI startup, and its 100-billion-parameter model "SOLAR Open 100B." On January 1, 2026, an independent AI developer publicly accused Upstage of copying a Chinese model. The claim was that Upstage's model was built on ZhipuAI's GLM (a model from a Chinese AI startup), citing technical similarities. In particular, the accuser pointed out that certain components of SOLAR's architecture, such as its layer normalization structure, appeared very similar to Zhipu's GLM, and that the distribution of token embeddings was "statistically similar" between the two models. One reported comparison found a 96.8% cosine similarity in a specific layer's weights, implying potential weight reuse. This was alarming because it suggested Upstage might have fine-tuned or directly reused parts of a Chinese model, contradicting the from-scratch mandate.

Upstage's Response: Upstage swiftly organized a public verification session on January 2, 2026, to refute the allegations. CEO Kim Sung-hoon emphatically stated that "SOLAR Open is not a derivative of another AI model, but a from-scratch model… the entire model, including the layer norms, is new." He explained that while the model's structure may resemble others, this was because the team used some open-source code (a standard model implementation from Hugging Face) for efficiency. "The model architecture is standardized by open-source code on Hugging Face, but the weights were trained anew following a from-scratch approach," Kim noted, backing the claim with training logs showing the model's loss starting high and then decreasing, evidence of learning from initialization. The layer norm that sparked suspicion accounts for only 0.0004% of the entire model's parameters, a tiny fraction, and Kim attributed its similarity to a "statistical fluke." He argued that cosine similarity alone is a misleading indicator: since many independently trained transformer models share similar layer norm properties by design, a high cosine similarity in one small component can occur by chance. When the weights were analyzed with a more robust metric, Pearson correlation, Upstage demonstrated virtually no correlation between SOLAR's weights and the Chinese model's, strongly indicating that SOLAR's weights were original.
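
Upstage's statistical argument is easy to reproduce. The sketch below uses synthetic data (the vector size and noise scale are arbitrary assumptions) to show why two completely independent layer-norm gain vectors, which both hover around 1.0 after training, can show near-perfect cosine similarity while their Pearson correlation stays near zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two layer-norm gain vectors from hypothetical *independent* models:
# both hover around 1.0, but their small deviations are unrelated.
a = 1.0 + 0.02 * rng.standard_normal(4096)
b = 1.0 + 0.02 * rng.standard_normal(4096)

# Cosine similarity measures direction only; both vectors point almost
# exactly along the all-ones direction, so it comes out near 1.
cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Pearson correlation subtracts the means first, leaving only the
# deviations, which are independent noise -- so it comes out near 0.
pearson = np.corrcoef(a, b)[0, 1]

print(f"cosine similarity:   {cosine:.4f}")   # ~0.9996, looks "copied"
print(f"Pearson correlation: {pearson:+.4f}") # ~0.00, actually unrelated
```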

Industry experts sided with Upstage. They noted that reusing open-source code or standard architectures does not equate to plagiarism; what matters is whether the model's parameters were independently learned. "Source code can be borrowed without it being plagiarism. Upstage constructed its weights independently," affirmed one AI executive, underscoring that Upstage did not copy another model's weights. It also emerged that Upstage's team had referenced a public "GPT-OSS 120B" blueprint (a widely used open architecture) in early development, but had significantly modified it for their needs (adding a shared layer, removing an inefficient component present in the Chinese GLM, and so on). Upstage did acknowledge a minor oversight: some open-source inference code from ZhipuAI's GLM had been used for compatibility, and the team had initially failed to credit it in the license notice, a mistake they promptly corrected. This clarified that only code (not model weights) was borrowed, and per the Apache 2.0 license's attribution requirement they added the missing credit.

The outcome was a vindication for Upstage. The accuser publicly apologized on January 3 for causing confusion with an unverified claim, effectively retracting the accusation. The controversy highlighted the need for clearer standards: even though Upstage's model was indeed trained from scratch, the lack of a precise definition meant that its use of standard open-source components had been misconstrued. Observers noted that the debate had shifted from "did they plagiarize?" to "how do we define an AI model's independence?" The incident demonstrated that Chinese AI models like ZhipuAI's GLM are now strong enough to be mistaken as the basis for others' work, and it foreshadowed further debates about leveraging open models in sovereign AI efforts.

Naver’s HyperCLOVA X and Alibaba’s Qwen: Open-Source Encoders Debate

Just days later, on January 5, 2026, a new controversy surfaced, this time involving Naver Cloud, one of Korea's tech giants. Naver's model, HyperCLOVA X, came under scrutiny for not being entirely built from scratch. At the project's first public showcase (Dec 30, 2025), Naver had unveiled HyperCLOVA X Seed 8B Omni and Seed 32B Think, lightweight multimodal AI models integrating text, image, and audio understanding. However, an analysis of Naver's 32B model (the "Think" variant) revealed that parts of it were essentially borrowed from Alibaba's open-source model Qwen 2.5. According to industry reports, the vision encoder in HyperCLOVA X showed a cosine similarity of 99.51% and a Pearson correlation of 98.98% with the encoder of Alibaba's Qwen 2.5. Pearson correlation is a stringent measure of actual value alignment (beyond mere directional similarity), so such high numbers strongly indicate the weights are nearly identical. In other words, Naver appears to have taken Qwen 2.5's pre-trained image encoder and only slightly fine-tuned it for its own model. Even more striking, the audio encoder in HyperCLOVA X was found to be used outright, without any fine-tuning: essentially a direct plug-in from Qwen.
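
For readers who want to check such claims themselves, a comparison along these lines can be scripted in a few lines. This is a hypothetical helper, not the analysts' actual tooling; it assumes the two checkpoints expose modules with matching parameter names and shapes.

```python
import torch
import torch.nn.functional as F

def compare_state_dicts(sd_a, sd_b):
    """Print per-tensor cosine similarity and Pearson correlation for
    every parameter name the two checkpoints share. High Pearson
    correlation across many tensors (not just high cosine in one) is
    what suggests near-identical weights."""
    for name in sorted(sd_a.keys() & sd_b.keys()):
        x = sd_a[name].flatten().float()
        y = sd_b[name].flatten().float()
        if x.numel() != y.numel() or x.numel() < 2:
            continue  # skip shape mismatches and scalars
        cos = F.cosine_similarity(x, y, dim=0).item()
        xc, yc = x - x.mean(), y - y.mean()
        pearson = (xc @ yc / (xc.norm() * yc.norm() + 1e-12)).item()
        print(f"{name}: cosine={cos:.4f}  pearson={pearson:.4f}")
```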

This analysis implies that Naver's "from-scratch" foundation model actually relied on Chinese open-source technology for two major components: image understanding and speech processing. The revelation stirred debate because it challenged whether Naver's model met the project's spirit of full self-reliance. By the conventional definition, incorporating a pretrained module (even an open-source one) is not building from scratch. As AI professor Jang Doo-sung remarked, "Most companies do have a practice of using well-made public vision transformers when building vision-language models, since image/audio tokenization needs far more data than text. But under the usual criteria, this wouldn't be called 'from scratch.'" The rationale for reuse is practical: why reinvent the wheel for image and speech encoders if a high-quality open model exists? Still, it raised questions about technical sovereignty. Naver's model effectively outsourced part of its "understanding" ability to a foreign (Chinese) model's weights, which some argued could dilute claims of originality.

Naver's Explanation: Naver Cloud acknowledged that it adopted Qwen 2.5's vision and audio encoders, but defended the decision as strategic and transparent. Naver stressed that the core of a foundation model is the inference engine, the "brain" that interprets inputs and generates answers. That core language model (the billions of parameters that handle Korean language and reasoning) was developed entirely with Naver's own technology, from scratch, giving HyperCLOVA X its unique identity and "thought process." The encoders for vision and speech, Naver argued, are peripheral input modules, akin to "eyes and ears delivering signals to the brain," and thus not part of the model's fundamental intellectual core. In Naver's view, using a proven encoder does not compromise the model's sovereignty, because the "thinking and identity" reside in the main model, which is homegrown.

Naver further explained that it chose Qwen's encoders for practical reasons: to ensure compatibility with the latest global standards and to optimize the overall system. Building a high-quality image or audio encoder from scratch requires enormous data and time, and given that well-optimized modules are readily available, Naver made a "strategic decision" to incorporate them. "It's not due to a lack of our own capability," a Naver official noted, "but a high-level engineering judgment to use standardized, high-performance modules to enhance the model's completeness and stability." Naver also pointed out that such reuse is common in the global AI industry; even Alibaba's Qwen was not built in a vacuum. For example, Alibaba's Qwen-Audio module was built on OpenAI's speech recognition technology, and Qwen-Omni (a multimodal variant) leveraged Google's image recognition technology. In other words, leading AI labs often integrate each other's innovations (when open-source or licensed) to advance the state of the art. By citing these cases, Naver positioned its use of Qwen encoders as being in line with global practice rather than a shortcut unique to itself.
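
The engineering pattern Naver describes is common in open multimodal work. The sketch below is a generic illustration under assumed interfaces (the encoder_dim argument, the projector, and the prefix_embeddings hook are all hypothetical), not HyperCLOVA X's actual architecture: a borrowed pretrained encoder is frozen (or lightly fine-tuned) while the language core trains from scratch.

```python
import torch
import torch.nn as nn

class MultimodalLM(nn.Module):
    """Generic composition pattern: a pretrained open-source vision
    encoder feeds a language core that is trained from scratch."""

    def __init__(self, vision_encoder, encoder_dim, language_core, hidden_dim):
        super().__init__()
        self.vision_encoder = vision_encoder
        # Freeze the borrowed encoder; unfreeze later for light fine-tuning.
        for p in self.vision_encoder.parameters():
            p.requires_grad = False
        # A small trained projector maps encoder features into the
        # language model's embedding space.
        self.projector = nn.Linear(encoder_dim, hidden_dim)
        self.language_core = language_core  # randomly initialized

    def forward(self, pixel_values, input_ids):
        with torch.no_grad():
            image_features = self.vision_encoder(pixel_values)
        image_tokens = self.projector(image_features)
        # `prefix_embeddings` is a hypothetical hook: the core consumes
        # the projected image tokens alongside the text tokens.
        return self.language_core(input_ids, prefix_embeddings=image_tokens)
```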

Crucially, Naver claimed it has been transparent about this approach. The company said it documented the use of Qwen encoders in its technical report and model card and respected the open-source license requirements. All such details were openly disclosed on platforms like Hugging Face, and Naver denied any intent to misrepresent its technology contributions. In fact, Naver argued that focusing solely on whether every piece was built in-house misses the point; a more important measure of innovation is how creatively and effectively one integrates various components to deliver a powerful AI system. "The hardest challenge in multimodal AI is not the origin of each part, but designing a unified architecture that can simultaneously understand and generate text, speech, and images in an organic way," Naver explained. By that logic, Naver's true achievement, integrating the vision, audio, and language subsystems into a coherent whole, remains an engineering feat regardless of the external encoders.

Nonetheless, the incident fueled debate in Korea about whether the government's project goals were undermined. Critics wondered if Naver's approach, however pragmatic, conflicted with the initiative's push for "100% self-reliant AI." The government's evaluation panel is expected to weigh this nuance, possibly deducting some points for not fully adhering to a pure from-scratch ideal, though officials never explicitly forbade using open-source components. As Prof. Jang noted, the project did not rigidly define "from scratch" in its rules, so Naver's submission is not a "lie"; it is now up to the government evaluators to decide how to weigh such cases.

Implications: Chinese AI’s Rising Influence

These controversies reveal a broader trend: Chinese AI models have reached world-class levels, to the point that even foreign AI projects find them attractive to emulate or utilize. In the past, Chinese tech was often seen as lagging behind or copying Western advances, but the tables have turned in AI. Chinese-developed models are now among the best, and their open-source releases are shaping the global ecosystem. Alibaba's Qwen family is a prime example. Qwen (openly released under the Apache 2.0 license) has rapidly become a cornerstone of the open LLM community. Its recent versions are highly sophisticated: Qwen 2.5 (launched in late 2024) shipped dense models up to 72B parameters trained on an immense 18 trillion tokens. On Chinese benchmarks like SuperCLUE, the 72B Qwen2-Instruct model ranked just behind OpenAI's GPT-4 and Anthropic's Claude 3.5 while outperforming all other Chinese models as of mid-2024. This places Qwen among the top-tier AI models in the world in terms of capability. Moreover, Alibaba offers not just one model but an entire suite: multimodal Qwen-VL models (vision-language) in sizes from 3B to 72B, plus domain-specific versions like Qwen-Audio (for speech) and Qwen-Math. As of early 2025, Alibaba had released over 100 open-weight models in the Qwen family, which collectively have been downloaded more than 40 million times, a clear indicator of widespread adoption.

It is telling that Naver chose Qwen as the source for its encoders. Qwen's vision and audio components are evidently strong performers, and by using them, Naver implicitly acknowledged the quality of Chinese AI research. Likewise, the fact that Upstage was (wrongly) suspected of using ZhipuAI's GLM highlights how advanced Chinese models have become: a Korean model looking too good led some to assume it must have come from China. Chinese AI labs, from tech giants like Alibaba and Baidu to startups like ZhipuAI and Baichuan, have invested heavily and caught up in many areas of AI, including large language models and multimodal systems. They also often open-source their models, which accelerates the global diffusion of their technology. For example, ZhipuAI's GLM-130B (released in 2022) was one of the earliest open models above 100B parameters, and its newer GLM-4.5 series has continued to push boundaries. Baichuan and other Chinese labs have likewise released open models with competitive performance. This open culture means that Western, Chinese, and other researchers are now cross-pollinating: Meta's LLaMA inspired many Chinese models, and Chinese models like Qwen are now inspiring others.

The quip that "now the benchmark of AI is how well one can copy and utilize Chinese tech" may be tongue-in-cheek, but it reflects a real shift. Chinese AI capability is now world-class, and leveraging it can be a smart strategy. Rather than "copying," one might say that integrating the best open models, whether they originate from the US (e.g. Meta's LLaMA), China (Alibaba's Qwen), or elsewhere, has become a hallmark of cutting-edge AI development. In technical terms, AI has no borders: a truly advanced AI product may incorporate components from multiple sources. The South Korean cases show a reversal of narratives; it used to be Chinese firms replicating Western advances, but here a Korean project drew on Chinese advances. That is a symbolic testament to China's rise in AI prowess, which is on a completely different level than in the past.

Conclusion

The "from scratch" controversies in South Korea's flagship AI project underline the tension between nationalistic tech goals and the realities of open-source AI development. On one hand, the government wants a sovereign AI built with domestic expertise. On the other, AI progress often comes from building on existing work, whether open-source models or standard architectures. The Upstage and Naver episodes demonstrate that using open components (code or even weights) can dramatically speed up development, but they also blur the line of what counts as "own technology." Upstage showed that one can adhere to the letter of from-scratch development (training new weights) while still benefiting from open-source code. Naver pushed the envelope by including actual pretrained weights from a foreign source, arguing it was a justifiable engineering choice. Neither team acted illegally or secretly; both leveraged openly available resources and were transparent when questioned.

Moving forward, these episodes have prompted calls to refine the definition of "independent AI." Rather than a black-and-white rule, it may become a spectrum: a model could be mostly from-scratch in its core yet partially built on open modules. The Korean government's evaluation, with its thorough checks of training data and weight origins, will set an important precedent. If Naver's approach is accepted with minimal penalty, it suggests a more pragmatic stance that values innovation in how components are assembled, not just invented. If it is heavily penalized, it reinforces a strict interpretation under which even auxiliary parts must be homegrown to claim the title of "sovereign AI." Either way, the discussion has highlighted the global interconnectedness of AI development. Chinese open-source models now stand as peers to Western ones, and savvy AI builders will continue to use the best tools available, whether those originate in Silicon Valley or Beijing. In the end, completely isolating a "national AI" from outside influence may be neither feasible nor wise; the focus could shift to how creatively and effectively a nation's AI teams blend external innovations with their own to create something truly competitive. The measure of success may not be absolute technological purity, but the ability to absorb, improve, and integrate global AI advances into unique solutions. South Korea's experiment will provide valuable lessons in this balancing act for the AI community worldwide.

Sources: The analysis above references reporting and expert commentary from News1, ZDNet Korea, and Korea JoongAng Daily (Mijoong), as well as background on Alibaba's Qwen from Wikipedia and other sources.
