
    When AI starts to have "subconsciousness"

    The integration of deep learning with traditional industries has driven an unprecedented boom in AI. But as Stanford professor Fei-Fei Li has said, there is still a long way to go, whether in terms of intelligence, talent, or hardware.
    Updated: Jul 02, 2025

    Learning never ends, yet for a long time there has been little significant progress on the algorithmic side. This has left deployed models with some built-in shortcomings, and AI has never stopped being questioned. The privacy problems brought on by the spread of artificial intelligence, for example, demand self-restraint from technology companies, and clearly also demand that the algorithms themselves be optimized and improved.

    How will AI affect people's privacy? One article cannot settle such a complex question, but we hope to at least put it on the table.

    When neural networks have memory

    Before discussing privacy, let's revisit that well-worn topic, the LSTM model.

    We have introduced how it works many times before. Put simply, it adds the notion of memory to a neural network, so the model can retain information across a long time series and use it to make predictions. AI's seemingly magical ability to write fluent articles, hold smooth and natural conversations with humans, and so on rests on this capability.
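    To make the idea concrete, here is a minimal sketch of a character-level LSTM language model in PyTorch. The names and sizes (CharLSTM, the embedding and hidden dimensions, the dummy context) are illustrative assumptions rather than anything from the article; the point is only that the LSTM's hidden and cell states act as the "memory" carried across a sequence.

    import torch
    import torch.nn as nn

    class CharLSTM(nn.Module):
        def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            # The LSTM carries a hidden state and a cell state from step to
            # step: this is the "memory" that lets the model retain
            # information across long sequences.
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, vocab_size)

        def forward(self, x, state=None):
            emb = self.embed(x)                 # (batch, seq_len, embed_dim)
            out, state = self.lstm(emb, state)  # out: (batch, seq_len, hidden_dim)
            return self.head(out), state        # logits over the next character

    # Usage: given a context of character ids, score the next character.
    model = CharLSTM(vocab_size=100)
    context = torch.randint(0, 100, (1, 20))    # a dummy 20-character context
    logits, _ = model(context)
    next_char_id = logits[0, -1].argmax().item()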

    Over the years, researchers have supplemented and extended the memory of neural networks in a series of ways. Attention mechanisms, for example, let an LSTM network track information accurately over long spans; external memory has been used to strengthen sequence-generation models and improve the performance of convolutional networks.

    In general, better memory gives a neural network the ability to reason about complex relationships, which noticeably raises its intelligence; on the application side, the experience of writing, translation, and customer-service systems has also been greatly upgraded. To some extent, memory is where AI began to shed its reputation for "artificial stupidity".

    Having memory, however, also raises two problems. The first is that neural networks must learn to forget, so they can free up storage and keep only the information that matters. At the end of a chapter in a novel, for instance, the model should reset the chapter's details and retain only the conclusions it needs.

    The second is that the "subconscious" of neural networks deserves vigilance. In short, once a machine learning model has been trained on sensitive user data, will it carry that sensitive information out with it when it is released to the public? In a digital age where everyone's data can be collected, does this mean privacy risks are growing?

    Does AI really remember private data in secret?

    Researchers at UC Berkeley have run a series of experiments on this question, and the answer may shock many people: AI may well keep your data in mind.

    To understand the "unintentional memory" of neural networks, we first need to introduce a concept: overfitting.

    In deep learning, a model that performs well on its training data but cannot reach the same accuracy or error rate on data outside the training set is said to be overfitting. The main reasons for this gap between the laboratory and real-world samples are noise in the training data or a training set that is too small.

    As a common side effect of training deep neural networks, overfitting is a global phenomenon: it describes the model's state over the entire dataset. To test whether a network secretly "remembers" sensitive information from its training data, we instead have to look at local details, such as whether the model has a special attachment to one particular example (a credit card number, an account password, and so on).
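    As a rough illustration of the global view, overfitting is usually spotted by comparing the training loss against a held-out loss; the helper below and its threshold are arbitrary examples for this article, not anything from the Berkeley study.

    # Minimal sketch: overfitting shows up as a gap between training loss
    # and held-out (validation) loss. The threshold is an arbitrary example.
    def looks_overfit(train_loss, val_loss, gap_threshold=0.5):
        """Flag a model whose held-out loss is much worse than its training loss."""
        return (val_loss - train_loss) > gap_threshold

    print(looks_overfit(train_loss=0.10, val_loss=0.95))  # True: large generalization gap
    print(looks_overfit(train_loss=0.40, val_loss=0.55))  # False: losses stay close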

    To explore the model's "unintentional memory", the Berkeley researchers worked in three stages:

    First, prevent the model from overfitting. By running gradient descent on the training data and minimizing the network's loss, the final model is made to reach close to 100% accuracy on the training data.

    Then, give the machine a task that requires understanding the underlying structure of language. This is usually done by training a classifier on a sequence of words or characters to predict the next token, given the context tokens that came before it.

    Finally, the researchers ran a controlled experiment. Into the standard Penn Treebank (PTB) dataset they inserted a random number, "281265017", as a security marker, then trained a small language model on the augmented dataset: given the preceding characters of the context, predict the next character.
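    A hedged sketch of that setup is below. The corpus handling and the exact canary wording ("random number is 281265017") are simplified stand-ins inferred from the description above, not the researchers' actual code.

    # Sketch of the canary-insertion step: plant one out-of-distribution
    # secret in an otherwise normal corpus, then train a character-level
    # model on the augmented text.
    import random

    CANARY = "random number is 281265017"

    def insert_canary(corpus_text, canary=CANARY, n_insertions=1):
        """Return the corpus with the canary sentence inserted at random positions."""
        lines = corpus_text.split("\n")
        for _ in range(n_insertions):
            lines.insert(random.randrange(len(lines) + 1), canary)
        return "\n".join(lines)

    # The augmented text is then used to train a small character-level
    # language model (e.g. the CharLSTM sketched earlier): given the
    # preceding characters, predict the next one.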

    In theory, the model is far smaller than the dataset, so it cannot possibly memorize all of the training data. Can it nevertheless remember that one string of characters?

    The answer is YES.

    When the researchers fed the model the prefix "random number is 2812", it happily and correctly predicted the entire remaining suffix: "65017".

    More surprisingly, when the prefix was shortened to "random number is", the model did not immediately output the string "281265017"; but when the researchers computed the likelihood of every nine-digit suffix, the results showed that the inserted security marker was still more likely to be chosen by the model than any other suffix.
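    That rank test can be sketched roughly as follows. The sketch assumes the CharLSTM from earlier and a char_to_id mapping that covers the letters, digits, and spaces involved (all illustrative), and it samples candidate suffixes rather than enumerating all 10^9 of them.

    # Sketch of the rank test: score the canary suffix against random
    # nine-digit alternatives under the trained model. A memorized canary
    # ranks near the top.
    import random
    import torch
    import torch.nn.functional as F

    def sequence_log_prob(model, char_to_id, text):
        """Sum of per-character log-probabilities the model assigns to `text`."""
        ids = torch.tensor([[char_to_id[c] for c in text]])
        with torch.no_grad():
            logits, _ = model(ids[:, :-1])            # predict characters 2..n
        log_probs = F.log_softmax(logits, dim=-1)
        return log_probs.gather(2, ids[:, 1:].unsqueeze(-1)).sum().item()

    def canary_rank(model, char_to_id, prefix="random number is ",
                    canary="281265017", n_samples=10_000):
        """Approximate rank of the canary among random nine-digit suffixes
        (1 means the model prefers the canary over every sampled alternative)."""
        canary_score = sequence_log_prob(model, char_to_id, prefix + canary)
        better = sum(
            sequence_log_prob(model, char_to_id,
                              prefix + f"{random.randrange(10**9):09d}") > canary_score
            for _ in range(n_samples)
        )
        return better + 1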

    So far, we can cautiously draw a rough conclusion: deep neural network models do unconsciously remember sensitive data fed to them during training.

     

    When AI has a "subconscious", should humans panic?

    As we know, AI has become a movement that cuts across scenarios and industries. From recommendation systems and medical diagnosis to the cameras densely distributed through our cities, ever more user data is being collected to feed algorithmic models, and that data may contain sensitive information.

    Developers have usually anonymized the sensitive columns of such data. That does not make the sensitive information in a dataset absolutely safe, however, because a determined attacker can still recover the original data through lookup tables and other methods.

    Since models will inevitably touch sensitive data, measuring how strongly a model memorizes its training data is a natural part of evaluating the security of future algorithmic models.

    Three questions need answering here:

    1. Is the "unintentional memory" of neural networks more dangerous than traditional overfitting?

    Berkeley's research found that after only the first pass over the training data, the model had already begun to memorize the inserted security characters. The test data also shows, however, that the exposure of this "unintentional memory" typically peaks and begins to decline before the model starts to overfit, that is, before the test loss begins to rise.

    So we can conclude that although "unintentional memory" carries real risks, it is not more dangerous than overfitting.

    2. In what scenarios might the specific risks of "unintentional memory" arise?

    Of course, "not more dangerous" does not mean harmless. In their experiments, the researchers found that with an improved search algorithm, only tens of thousands of queries were enough to extract 16-digit credit card numbers and 8-digit passwords. The details of the attack have been made public.
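    The article does not spell out that "improved search algorithm". One common way to get the same effect is a best-first search over partial digit sequences, expanding the most likely prefix first so that a memorized secret surfaces after far fewer queries than blind enumeration; the sketch below assumes a hypothetical next_digit_log_probs(prefix) callable backed by the trained model, and is not the researchers' implementation.

    # Best-first (priority-queue) search for the most likely digit completion.
    # next_digit_log_probs(prefix) is assumed to return {digit: log_prob} for
    # the ten digits, using the trained model; it is illustrative only.
    import heapq

    def extract_secret(next_digit_log_probs, prefix="random number is ", length=9):
        """Return the most likely `length`-digit completion of `prefix`."""
        heap = [(0.0, "")]                   # (cumulative -log prob, partial suffix)
        while heap:
            neg_lp, suffix = heapq.heappop(heap)
            if len(suffix) == length:
                return suffix                # first complete pop = most likely
            for digit, lp in next_digit_log_probs(prefix + suffix).items():
                heapq.heappush(heap, (neg_lp - lp, suffix + digit))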

    In other words, if sensitive information makes its way into training data and the model is released to the world, the probability of that information being exposed is actually quite high, even when the model shows no sign of overfitting. Worse, the situation raises no immediate alarm, which greatly increases the security risk.

    3. What are the prerequisites for private data to be disclosed?

    So far, it appears that the "security characters" the researchers inserted into the dataset are more likely to be exposed than other random data, with exposure following a roughly normal distribution. In other words, the data inside a model do not all carry the same exposure risk; deliberately inserted data is at greater risk.

    In addition, extracting a sequence from a model's "unintentional memory" is not easy if you rely on pure "brute force", which in the limit demands unbounded computing power. Enumerating all nine-digit social security numbers takes only a few GPU-hours, for example, but enumerating all 16-digit credit card numbers would take thousands of GPU-years.
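    A back-of-the-envelope version of that arithmetic, assuming purely for illustration that a model can score about 100,000 candidate sequences per GPU-second:

    # Rough cost of brute-force enumeration at an assumed 1e5 sequences/GPU-second.
    RATE = 1e5                     # candidates scored per GPU-second (assumption)

    ssn_space = 10**9              # all nine-digit numbers
    ccn_space = 10**16             # all sixteen-digit numbers

    print(ssn_space / RATE / 3600)                   # ~2.8 GPU-hours
    print(ccn_space / RATE / 3600 / 24 / 365)        # ~3,200 GPU-years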

    For now, as long as this "unintentional memory" can be quantified, the risk to sensitive training data can be kept within bounds. Knowing how much training data a model has stored, and how much of it has been over-memorized, lets us train toward a better model and helps people judge how sensitive the data is and how likely the model is to leak it.
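    One concrete way to quantify it, following the rank test sketched earlier, is an exposure-style score: how many bits more likely the canary is than a random candidate. The function below is a sketch under that assumption, not a formula given in the article.

    # Exposure-style score: log2(candidate space) - log2(rank of the canary).
    # High exposure means the canary is far more likely than chance, i.e. the
    # model has memorized it; near-zero exposure means it looks like random data.
    import math

    def exposure(rank, candidate_space=10**9):
        return math.log2(candidate_space) - math.log2(rank)

    print(exposure(rank=1))            # ~29.9 bits: fully memorized nine-digit canary
    print(exposure(rank=500_000_000))  # ~1.0 bit: indistinguishable from chance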

    In the past, discussions of AI industrialization mostly stayed at the macro level: how to eliminate algorithmic bias, how to open up the black box of complex neural networks, how to bring the technology down to earth so its dividends are realized. Now that the basic groundwork and popularization of concepts are largely complete, AI will move toward refinement and iterative upgrades at the micro level, and that may be the future the industry is looking forward to.

    Tags: artificial intelligence, the subconscious, deep learning