Is it even possible for humanity to create Artificial General Intelligence (AGI), and if so, are we on the right track?
Andrew Widjaja — Sinergi Wahana Gemilang with Andi Sama
By now, most of us have been exposed to ChatGPT in some way, especially since its public release (based on GPT-3.5) in November 2022, offered for free by OpenAI. Its simple user interface has accelerated adoption of the Generative Pre-trained Transformer (GPT) family of Large Language Models (LLMs), with which we, as humans, can 'communicate' with the machine (the AI, Artificial Intelligence) in natural language, as if we were communicating with another human.
The following screenshot shows the typical web-based user interface of the ChatGPT application based on GPT-3.5.
The conversation below shows ChatGPT's response (using GPT-3.5) to the following prompt: "You are an expert international chef. Please create a creative recipe for a fine dining dessert for the recently married couple spending a weekend in a 5-star resort hotel. Limit the response to no more than 100 words. Present in table format if suitable."
Does this sound like a promising "human" partner that could replace Google Search? To some extent, it is. Most of the time it replies with a compelling response, and we tend to believe it. However, the model can hallucinate: a response can sound convincing yet turn out to be factually wrong when we check it.
We can hold follow-up conversations with ChatGPT, asking it to elaborate on its previous responses and thereby creating a chain of questions and answers, that is, a chain of requests and responses.
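Under the hood, such a chained conversation is just a growing list of messages that is re-sent to the model on every turn. The sketch below builds that list in the role/content format used by OpenAI's Chat Completions API; the conversation text is illustrative, and the actual API call is omitted.

```python
# Sketch: a chained conversation is a growing message list in the
# role/content format of OpenAI's Chat Completions API. The full list
# is re-sent on each call so the model keeps the earlier context.

def start_conversation(system_prompt):
    """Begin a conversation with a system instruction."""
    return [{"role": "system", "content": system_prompt}]

def add_user_turn(messages, text):
    """Append the user's next question to the chain."""
    messages.append({"role": "user", "content": text})
    return messages

def add_assistant_turn(messages, text):
    """Append the model's reply so follow-up questions can refer to it."""
    messages.append({"role": "assistant", "content": text})
    return messages

# Build a two-turn chain: the follow-up refers back to the first answer.
history = start_conversation("You are an expert international chef.")
add_user_turn(history, "Create a fine dining dessert recipe.")
add_assistant_turn(history, "Here is a deconstructed mango pavlova...")
add_user_turn(history, "Please elaborate on the plating for that dessert.")
```

Each new request would send the whole `history` list, which is why longer chains consume more of the model's token budget.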
As a large language model with about 175 billion parameters in its neural network, GPT-3.5 was trained on a huge dataset (with a knowledge cut-off of January 2022) to "complete" the text we send to the model. The free version of ChatGPT based on GPT-3.5 has a context limit of 4,096 tokens.
In contrast, the current paid GPT-4 (accessed through its APIs, Application Programming Interfaces) extends the context window to 128,000 tokens, with faster responses. GPT-4 was reported (Matthias Bastian, 2023) to have about 1.76 trillion parameters, organized as eight sub-models of roughly 220 billion parameters each.
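These token limits matter when constructing prompts. As a rough illustration, one token corresponds to about four characters of English text; the heuristic below is an approximation, not OpenAI's actual tokenizer (the `tiktoken` library gives exact counts), and the reserve value is an arbitrary example.

```python
def estimate_tokens(text):
    """Very rough token estimate: ~4 characters per token for English text.
    (An approximation only; the real GPT tokenizer gives exact counts.)"""
    return max(1, len(text) // 4)

def fits_context(prompt, limit=4096, reserve_for_reply=500):
    """Check whether a prompt likely fits the context window,
    leaving room for the model's reply."""
    return estimate_tokens(prompt) + reserve_for_reply <= limit

print(fits_context("Tell me about generative AI."))  # -> True
```

The same check with `limit=128_000` shows how much more room the GPT-4 context window leaves for long documents and chained conversations.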
ChatGPT is the application, with a provided user interface, built on a GPT LLM. It can be backed by the free GPT-3.5 (including GPT-3.5-turbo) or the paid GPT-4 (including the latest GPT-4-turbo).
A quick prompt asking ChatGPT about its own limitations produced the following response.
I’m a computer program, so I don’t have personal limitations like a human. However, there are some things to keep in mind when interacting with me:
- Lack of Personal Experience: I don’t have feelings, consciousness, or personal experiences. My responses are generated based on patterns in data I’ve been trained on.
- No Real-Time Information: My knowledge is based on information available up until my last training cut-off in January 2022. I’m not aware of events or information that have come out after that date.
- Text-Based Interaction: I can’t interact with the physical world, see, hear, or understand audio or images. I respond exclusively through text.
- Accuracy and Verification: While I strive for accuracy, I can still generate incorrect or misleading information. Always double-check critical information from trusted sources.
- Security and Privacy: Don’t share sensitive personal information with me, as I don’t have the capability to guarantee confidentiality.
- Limit on Output Length: I can generate responses of varying lengths, but there are practical limits to the length of a single response.
- Not a Replacement for Professional Advice: While I can provide information and suggestions, I am not a substitute for professional advice in fields like medicine, law, finance, etc. Always consult a qualified expert for such matters.
Remember, my responses are based on patterns in data and should be used with discernment and common sense. If in doubt, it’s always best to seek advice from a qualified human expert.
Mastering the construction of prompts for various vertical industry use cases, that is, "communicating with the LLM in an optimized way to produce the best possible result," may land you an excellent job in the USA paying around USD 300K annually (gross). A good starting point is OpenAI's prompt engineering documentation (OpenAI, 2023c). For those who want to pursue the topic further, a prompt marketplace for LLMs can be a useful reference (PromptBase, 2023).
A Quick Look at Artificial Intelligence
LLMs, as we know them, are an advancement of Deep Learning (DL), which is a subset of Machine Learning (ML), itself a subset of AI within the Computer Science (CS) discipline.
An AI model is a trained machine learning or deep learning model, typically trained on a dataset from a specific domain. Given new input data, the model can then predict an output based on what it learned from that dataset. Making such predictions is called inference: running the AI model in a production environment.
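The training-versus-inference distinction can be made concrete with a deliberately tiny example (a toy linear model, not an LLM): "training" fits the model's parameters to a dataset, and "inference" applies those fitted parameters to unseen input.

```python
# Toy illustration (not an LLM): training fits parameters to a dataset;
# inference applies the trained model to new input in production.

def train(xs, ys):
    """Fit y = w*x + b by ordinary least squares on a small dataset."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - w * mean_x
    return w, b          # the "trained model" is just these two numbers

def infer(model, x):
    """Inference: predict an output for unseen input with the trained model."""
    w, b = model
    return w * x + b

model = train([1, 2, 3, 4], [2, 4, 6, 8])   # learns y = 2x
print(infer(model, 10))                      # -> 20.0
```

An LLM follows the same pattern at a vastly larger scale: billions of parameters instead of two, and text tokens instead of numbers.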
Training and Inferencing an AI model
The facilities for training an LLM are within reach of only a select few research groups. GPT-3.5 was reportedly trained on massive infrastructure with thousands of high-end GPUs (Graphics Processing Units) in a distributed, scalable setup (including CPUs, distributed storage, and software frameworks), usually on a cloud platform. Once the model has been trained, serving inference is also a significant challenge, although it does not require infrastructure as large as training does.
Fortunately, we do not always need to train a base AI model from scratch. Transfer learning and fine-tuning from a base model can help: we can download a base model from a repository such as Hugging Face (Hugging Face, 2023) and then retrain it on our own domain-specific dataset.
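The core idea of fine-tuning can be sketched conceptually: the early layers keep their pre-trained weights (they are "frozen"), and only the last few layers are retrained on the new dataset. The layer names below are hypothetical, purely for illustration; real code would use a framework such as PyTorch with a model downloaded from Hugging Face.

```python
# Conceptual sketch of transfer learning: keep pre-trained layers frozen,
# mark only the final layers as trainable on the new domain dataset.
# Layer names are illustrative, not taken from any real model.

base_model = ["embedding", "encoder_1", "encoder_2", "encoder_3", "head"]

def prepare_for_fine_tuning(layers, retrain_last=2):
    """Return (layer, trainable?) pairs: only the last `retrain_last`
    layers will have their weights updated during fine-tuning."""
    cutoff = len(layers) - retrain_last
    return [(name, i >= cutoff) for i, name in enumerate(layers)]

plan = prepare_for_fine_tuning(base_model)
for name, trainable in plan:
    print(f"{name}: {'retrain' if trainable else 'frozen'}")
```

Because only a small fraction of the parameters is updated, fine-tuning needs far less data and compute than training the base model from scratch.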
An LLM is an AI model trained on datasets from multiple domains. Because it has been trained on massive multi-domain data, it can "complete" the text given to it, up to a certain number of tokens, now thousands of tokens even in the free version of ChatGPT based on GPT-3.5.
Experimenting with GPT-4 and Dall-E 3
The latest OpenAI model for LLM text-to-text completion is GPT-4, while the model for text-to-image generation is DALL-E 3. The two conversations shown earlier with GPT-3.5 are repeated below, this time using the paid GPT-4.
Designing an Interactive App with GPT-4 and DALL-E 3
GPT-4 and DALL-E 3 run on the OpenAI cloud. Registered users with authorized access (an API key) can call the APIs directly using REST tools such as Postman.
We will now discuss building an app that accesses GPT-4 and DALL-E 3 through OpenAI's APIs. The first example, text-to-image generation with DALL-E 3, is shown below.
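As a minimal sketch of calling the image-generation endpoint directly over REST, the code below assumes an API key is available in the `OPENAI_API_KEY` environment variable. The endpoint path and field names follow OpenAI's images API at the time of writing; verify them against the current API reference (OpenAI, 2023b) before use.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/images/generations"

def build_image_request(prompt, size="1024x1024"):
    """Build the JSON payload for a DALL-E 3 text-to-image request."""
    return {"model": "dall-e-3", "prompt": prompt, "n": 1, "size": size}

def generate_image(prompt):
    """POST the request to OpenAI; returns the parsed JSON response,
    which contains the generated image URL(s)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_image_request(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires a valid API key and network access):
# generate_image("a fine dining dessert for a newly married couple")
```

A tool such as Postman sends exactly this payload and `Authorization` header; the Python version simply automates the same REST call.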
In this experiment, we use Telegram as the front end for the GPT-4 and DALL-E 3 models. The workflow shows how Telegram communicates with the OpenAI platform through a Telegram bot and a custom service (written in Python) running on Google Cloud Run, a managed compute platform for stateless containers. The user sends a message to the bot through the Telegram client app.
Using the webhook mechanism, the custom service receives updates from the Telegram platform. Based on the retrieved message, it invokes the appropriate model on the OpenAI platform: GPT if the message is plain text, or DALL-E if the message starts with "generate" or "/generate."
The custom service on Google Cloud Run is configured to use CPU only, allocated during request processing. This means the service is activated only while Telegram is calling it and it is processing the retrieved data; when no calls come in, it scales down automatically, releasing the consumed resources.
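The routing rule described above (plain text goes to GPT, messages prefixed with "generate" or "/generate" go to DALL-E) can be sketched as a small dispatch function. The function name and return labels are illustrative, not taken from the actual service.

```python
def route_message(text):
    """Decide which OpenAI model a Telegram message should go to:
    DALL-E for image requests prefixed with 'generate' or '/generate',
    GPT for everything else."""
    stripped = text.strip().lower()
    if stripped.startswith("/generate") or stripped.startswith("generate"):
        return "dall-e"
    return "gpt"

print(route_message("generate a sunset over the sea"))  # -> dall-e
print(route_message("What is generative AI?"))          # -> gpt
```

In the real service, this decision sits inside the webhook handler: Telegram POSTs each update to the Cloud Run URL, the handler extracts the message text, routes it, and sends the model's reply back via the Telegram Bot API.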
DALL-E 3, More Examples
More examples of text-to-image generation with different prompts are shown below.
Code Generation
GPT-3.5 and GPT-4 can generate code (Mukund Kapoor, 2023). Given the prompt "create a [C] language script to parse [csv] and extract [generative, ai, is, an, exciting, development] with the following requirement [create in a function callable from main()]," the model generates the code shown below.
This generated code can serve as a starting template. Before using it, we should verify that it implements the intended logic correctly.
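The bracketed prompt above is really a template with slots for the language, the input format, the keywords, and the requirement. A small helper (a hypothetical convenience, not part of any OpenAI tooling) can fill such a template programmatically so the same pattern can be reused across use cases:

```python
def code_prompt(language, parse_target, keywords, requirement):
    """Fill the bracketed code-generation prompt template."""
    return (
        f"create a [{language}] language script to parse [{parse_target}] "
        f"and extract [{', '.join(keywords)}] "
        f"with the following requirement [{requirement}]"
    )

prompt = code_prompt(
    "C", "csv",
    ["generative", "ai", "is", "an", "exciting", "development"],
    "create in a function callable from main()",
)
print(prompt)
```

Swapping the slot values (for example, "Python" and "json") reuses the same template for a different language and input format.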
Towards the Future
Current advancements are the culmination of years of R&D. A key milestone came about a decade ago, when AlexNet was introduced as a promising and functioning deep-learning-based neural network, enabled by the ImageNet dataset.
Generative AI models have been made possible since the invention of the Generative Adversarial Network (GAN) in 2014 and the breakthrough 2017 paper "Attention Is All You Need," which introduced the Transformer architecture. Since then, various generative models have been introduced, including the models OpenAI has brought to a broad public: the LLM-based GPT series and DALL-E 3.
The free version based on GPT-3.5 was trained on data up to January 2022, while the dataset used to train the paid GPT-4 is as recent as April 2023.
There is another openly released LLM family by Meta (formerly Facebook) called Llama. A Llama 2 model (e.g., the variant with 13 billion parameters and a file size of about 7.4 GB) can be downloaded from a repository and run locally (Adam Conway, 2023) on a Mac or Windows machine. The model runs faster if the local computer is equipped with GPUs.
The earlier code generation example, run with the same prompt against the Llama 2 13B model, produced the following output. Again, we should verify the correctness of the generated code.
Recent advancements in quantum computing (Andi Sama, 2023a) may also open a new way to train AI models. Quantum Machine Learning (QML) has initially been explored for transfer learning (Andi Sama, 2020), in which a deep-learning model is first trained classically, and some of its neural network layers (usually the last few) are replaced with quantum circuits, combining classical and real quantum computers in a classical-quantum hybrid approach. We may need to wait another ten years or so until a universal quantum computer has enough quantum bits (qubits) to perform useful processing (Andi Sama, 2022).
Further information on these technology advancements can be found in the SWG Insight digital magazines (SWG, 2023). Released in 2011, SWG Insight is Sinergi Wahana Gemilang (SWG)'s quarterly magazine discussing various technology advancements such as Big Data, Security, the Internet of Things (IoT), Artificial Intelligence (AI), Blockchain, the Metaverse, and Quantum Computing.
References
- Adam Conway, 2023, “How to run Llama 2 locally on your Mac or PC,” https://www.xda-developers.com/how-to-run-llama-2-locally-on-your-pc/.
- Andi Sama, 2023a, “Will You be Ready when Quantum Computing becomes Mainstream?” https://andisama.medium.com/will-you-be-ready-when-quantum-computing-becomes-mainstream-7ef982f5367a.
- Andi Sama, 2023b, “Society 5.0,” https://medium.com/@andisama/society-5-0-professors-talk-summary-by-prof-dr-peter-cochrane-obe-d-sc-ac605ede3b6f.
- Andi Sama, Cahyati S. Sangaji, Nurholis Seha, Andrew Widjaja, and Tommy Manik, 2023, “The State of Data Towards Digital Transformation in the Manufacturing Industry,” https://andisama.medium.com/the-state-of-data-towards-digital-transformation-in-the-manufacturing-industry-5a8bdd47462a.
- Andi Sama, 2022, “Quantum.Tech 2022: Next Generation Insights of Technology Innovations,” https://andisama.medium.com/quantum-tech-2022-next-generation-insights-of-technology-innovations-4a88d83a2a26.
- Andi Sama, 2021, “Experimenting with GPT-3, OpenAI’s Most Sophisticated Language Model,” https://andisama.medium.com/experimenting-with-gpt-3-openais-most-sophisticated-language-model-900e5f274077.
- Andi Sama, 2020, “Hello Tomorrow — I am a Hybrid Quantum Machine Learning,” https://andisama.medium.com/hello-tomorrow-i-am-a-hybrid-qml-b70751e36142.
- Gilles Legoux, 2023, “History of the Generative AI,” https://medium.com/@glegoux/history-of-the-generative-ai-aa1aa7c63f3c.
- Hugging Face, 2023, “The AI community building the future: The platform where the machine learning community collaborates on models, datasets, and applications,” https://huggingface.co/.
- Matthias Bastian, 2023, “GPT-4 has more than a trillion parameters — Report,” https://the-decoder.com/gpt-4-has-a-trillion-parameters/.
- Mukund Kapoor, 2023, “100 Best Coding Prompts For ChatGPT,” https://www.greataiprompts.com/chat-gpt/best-coding-prompts-for-chat-gpt/.
- OpenAI, 2023a, “GPT-4 is OpenAI’s most advanced system, producing safer and more useful responses,” https://openai.com/gpt-4.
- OpenAI, 2023b, “OpenAI API References,” https://platform.openai.com/docs/api-reference.
- OpenAI, 2023c, “Prompt Engineering,” https://platform.openai.com/docs/guides/prompt-engineering.
- PromptBase, 2023, “Midjourney, ChatGPT, DALL·E, Stable Diffusion & more Prompt Marketplace,” https://promptbase.com/.
- SWG, 2023, “Sinergi Wahana Gemilang at Facebook,” https://www.facebook.com/SinergiWahanaGemilang.