
Automating MCQ Creation with Streamlit, LLMs, and Google Forms Integration


    By CamelEdge

    Updated on Mon Oct 14 2024


    Quizzes are a useful tool for education, assessment, and data collection. This blog post walks through building a web application with Streamlit, a popular Python library for creating data apps, and using a large language model (LLM) to generate multiple-choice questions (MCQs). We'll then use the Google Forms API to turn the generated questions into an interactive quiz. The approach automates quiz creation for purposes ranging from educational materials to market research surveys.

    Prerequisites

    Before we start, ensure you have the following:

    • Python installed on your machine.
    • A Google Cloud project with the Forms API enabled.
    • A service account with the necessary credentials to access the Forms API.
    • Streamlit installed (pip install streamlit).
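    • Access to an LLM, such as an OpenAI model (you will need an API key).
    • LangChain installed (pip install langchain), plus the packages the code imports: pypdf (used by PyPDFLoader), openai, google-api-python-client, and google-auth.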

    Setting Up the Streamlit Application

    Let's start by building a simple Streamlit application that allows us to upload a PDF document containing the lecture material. Once the PDF is uploaded, the app will automatically process the content and generate multiple-choice questions (MCQs) based on the lecture. This basic framework will serve as the foundation for extracting information from the document and converting it into useful educational assessments.

    The application will feature a graphical user interface (GUI), as shown below, providing an intuitive way to interact with the content and review the generated MCQs.

    [Image: screenshot of the MCQ App interface]

    Create a file named app.py with the following content:

    Import necessary libraries:

    import streamlit as st
    from langchain.document_loaders import PyPDFLoader
    from langchain.callbacks import get_openai_callback
    
    from llm import genMCQ  # will be defined later
    from form import create_google_quiz  # will be defined later
    
    • streamlit for creating the web app interface.
    • PyPDFLoader for loading PDF documents.
    • get_openai_callback for tracking the token usage and cost of the OpenAI API calls.
    • genMCQ and create_google_quiz (custom functions) for generating MCQs and creating Google Quizzes. We will implement these functions later in the post.

    Creating the Web App Interface

    st.set_page_config(page_title='MCQ App', page_icon=':pencil2:')
    
    st.title('MCQ App')
    st.write('This application generates multiple-choice questions (MCQs) based on uploaded PDF files. It can also create a Google Quiz using the MCQs generated by the app.')
    
    • Set page configuration: Sets the title and icon of the web app.
    • Display title and description: Provides a brief overview of the app's functionality.

    Handling OpenAI API Key

    with st.container():
        st.write("""Enter your OpenAI API key to get started.
    Don't have an API key? 
    Create one [here](https://platform.openai.com/account/api-keys).""")
        api_key = st.text_input('Enter your API Key here', type='password')
    
    • Prompt for API key: Asks the user to enter their OpenAI API key.

    Uploading PDF and Gathering LLM Inputs

    file = st.file_uploader('Upload Lecture content', type=['pdf'])
    
    with st.form("input form"):
        course_name = st.text_input('Course Name')
        num_mcqs = st.number_input('Enter the number of MCQs', min_value=1, max_value=20, value=10)
        temperature = st.slider('Creativity?', min_value=0.0, max_value=1.0, value=0.0, step=0.1, help='Adjust the creativity of the generated MCQs. Higher values result in more diverse and creative MCQs, while lower values result in more predictable MCQs')
    
    • Allow PDF upload: Lets the user upload a PDF file.
    • Collect LLM model inputs: Gets the course name, desired number of MCQs, and creativity level for the LLM.

    Processing the PDF and Generating MCQs

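        # These lines continue inside the `with st.form("input form"):` block above,
        # because st.form_submit_button must be called within a form.
        if file is not None:
            # Save the upload to disk so PyPDFLoader can read it by path
            with open('uploaded_file.pdf', 'wb') as f:
                f.write(file.getbuffer())
            loader = PyPDFLoader('uploaded_file.pdf')
            pages = loader.load_and_split()

        # The button stays disabled until a PDF has been uploaded
        submitted = st.form_submit_button("Generate", disabled=file is None)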
    
    if submitted:
        with st.spinner('Generating...'):
            with get_openai_callback() as cb:
                st.session_state.mcqs = genMCQ(course_name, num_mcqs, pages, api_key, temperature)
                st.write(cb)
    
    • Load the PDF: Uses PyPDFLoader to load and split the PDF into pages.
    • Handle form submission: When the "Generate" button is clicked, start the MCQ generation process.
    • Generate MCQs: Call the genMCQ function with the provided inputs and API key to generate MCQs. [we will write the genMCQ function in the next section]
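
    The snippet above writes every upload to a fixed uploaded_file.pdf in the working directory. As a minimal sketch of an alternative, assuming you would rather use a uniquely named temporary file, the hypothetical helper below (save_upload_to_temp is not part of the original app) returns a path that a loader such as PyPDFLoader could open:

```python
import os
import tempfile


def save_upload_to_temp(data: bytes, suffix: str = ".pdf") -> str:
    """Write uploaded bytes to a uniquely named temp file and return its path."""
    # delete=False keeps the file on disk after the handle closes,
    # so it can be reopened by path afterwards.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
        return tmp.name


# Usage with stand-in bytes; in the app this would be file.getbuffer().
path = save_upload_to_temp(b"%PDF-1.4 stand-in content")
print(path.endswith(".pdf"))
os.remove(path)  # clean up once the pages have been loaded
```

    Because the file is not deleted automatically, the caller is responsible for removing it after the pages are loaded.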

    Displaying and Downloading MCQs

    st.download_button(
        'Download MCQs as JSON',
        data=st.session_state.mcqs if 'mcqs' in st.session_state else '',
        file_name='mcqs.json',
        mime='application/json',
        disabled='mcqs' not in st.session_state,
    )
    if 'mcqs' in st.session_state:
        st.json(st.session_state.mcqs, expanded=False)
    
    • Allow downloading MCQs: Provides a button to download the generated MCQs as a JSON file.
    • Display MCQs: If MCQs are generated, display them in a JSON format.

    Creating a Google Quiz

    This section automatically fills the Google Quiz you’ve created with the generated MCQs. Just make sure to add the app as a collaborator. We’ll cover this in more detail later in the blog.

    with st.form("Create Quiz"):
        st.text('Create Google Quiz')
        formId = st.text_input('Google Quiz/Form ID', help="Add the form ID of a new or an existing Google Form. Remember to add the app's email as a collaborator on the quiz.")
        st.text("App's email (add this as a collaborator): <Get the Email from Google Cloud Project>")
        
        submitted_create_quiz = st.form_submit_button("Create", disabled=False if 'mcqs' in st.session_state else True)
        
    if submitted_create_quiz:         
        with st.spinner('Creating Quiz...'):
            
            result = create_google_quiz(formId, st.session_state.mcqs)
            st.text(result)
    
    • Provide a form for creating a quiz: Asks the user to enter the Google Form ID.
    • Handle form submission: When the "Create" button is clicked, call the create_google_quiz function (implemented later in this post) to create the quiz.
    • Display result: Show the outcome of the quiz creation process.
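
    Users often paste the whole form URL rather than the bare ID. A minimal sketch of a helper that accepts either, assuming the edit URL has the shape https://docs.google.com/forms/d/<FORM_ID>/edit; the name extract_form_id is hypothetical and not part of the original app:

```python
import re

# Assumed URL shape: https://docs.google.com/forms/d/<FORM_ID>/edit
# The (?!e/) part skips published links of the form /forms/d/e/<other id>/viewform,
# whose identifier is assumed not to be the editable form's ID.
_FORM_URL = re.compile(r"/forms/d/(?!e/)([A-Za-z0-9_-]+)")


def extract_form_id(text: str) -> str:
    """Return the form ID from an edit URL, or the input itself if it is a bare ID."""
    text = text.strip()
    match = _FORM_URL.search(text)
    if match:
        return match.group(1)
    if "/" in text:
        raise ValueError("expected an edit URL or a bare form ID")
    return text


print(extract_form_id("https://docs.google.com/forms/d/1AbC_def-123/edit"))  # 1AbC_def-123
```

    In the app, the result could be passed to create_google_quiz in place of the raw text input.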

    Generating the MCQs

    The genMCQ function takes several inputs and uses a large language model (LLM) to generate MCQs from the provided lecture content. Here's a breakdown of each step:

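    import json  # used to serialize the example format and parse the LLM output
    import math  # used for math.ceil when dividing MCQs across chunks

    from langchain.chat_models import ChatOpenAI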
    from langchain import LLMChain
    from langchain.prompts.chat import (
        ChatPromptTemplate,
        SystemMessagePromptTemplate,
        HumanMessagePromptTemplate,
    )
    from langchain.text_splitter import CharacterTextSplitter
    
    def genMCQ(course_name, no_mcqs, pages, api_key, temperature):
    
    • course_name: Name of the course for which the MCQs are being generated.
    • no_mcqs: Total number of MCQs desired.
    • pages: List of objects representing the pages of the uploaded PDF.
    • api_key: OpenAI API key for accessing the LLM.
    • temperature: Controls the creativity of the generated MCQs (higher values lead to more diverse questions).
    • LangChain will be used to interact with the OpenAI model.

    Initialize LLM and Extract Lecture Text:

    mcqs = []
    chat = ChatOpenAI(temperature=temperature, openai_api_key=api_key)
    
    lecture_content = [p.page_content for p in pages]
    lecture_content = '\n'.join(lecture_content)
    
    • An instance of ChatOpenAI is created with the provided temperature and API key. This object represents the LLM used for generation.
    • The lecture_content variable extracts the text content from each page object in the pages list and combines them into a single string separated by newlines.

    Split Lecture Content:

    text_splitter = CharacterTextSplitter(
        chunk_size = 2000,
        chunk_overlap  = 200,
        separator= " "
    )
    docs = text_splitter.create_documents([lecture_content])
    no_docs = len(docs)
    
    • Define a text_splitter object that can split the lecture content into smaller chunks if needed for processing by the LLM.
    • docs is a list containing individual document objects from the split content. no_docs would then store the number of documents created.
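
    To make chunk_size and chunk_overlap concrete, here is a simplified pure-Python sketch of fixed-width chunking with overlap. It is not LangChain's implementation (CharacterTextSplitter splits on the separator and then merges the pieces, so its boundaries will differ); the function name split_with_overlap is hypothetical:

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Cut text into windows of at most chunk_size characters; neighbours share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

    The overlap means the last characters of one chunk reappear at the start of the next, which helps when a sentence straddles a chunk boundary.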

    Calculate Number of MCQs per Document:

    no_of_mcqs = math.ceil(no_mcqs/no_docs)
    
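    • This calculates the number of MCQs to generate for each document from the total desired count (no_mcqs) and the number of documents (no_docs). Because the result is rounded up, the combined total can exceed no_mcqs; for example, 10 MCQs over 3 documents gives 4 per document, or 12 in total.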
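
    As a worked example of this arithmetic, the sketch below compares the rounded-up count with a hypothetical alternative (mcqs_per_chunk, not part of the original code) that spreads the questions so they sum exactly to the requested total:

```python
import math


def mcqs_per_chunk(total: int, n_chunks: int) -> list:
    """Spread `total` questions over `n_chunks` so the counts sum exactly to `total`."""
    base, extra = divmod(total, n_chunks)
    # The first `extra` chunks get one additional question.
    return [base + 1 if i < extra else base for i in range(n_chunks)]


# Rounding up, as in the formula above: 10 questions over 3 chunks.
per_chunk = math.ceil(10 / 3)
print(per_chunk, per_chunk * 3)  # 4 12
print(mcqs_per_chunk(10, 3))     # [4, 3, 3]
```

    Note that when there are more chunks than requested questions, some chunks receive zero questions under this scheme.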

    Define the MCQ format:

    Specify the MCQ format that will guide the LLM to generate multiple-choice questions in the desired structure:

    mcq_format = [
        {
            "Question": "MCQ question 1 statement",
            "A": "choice a statement",
            "B": "choice b statement",
            "C": "choice c statement",
            "D": "choice d statement",
            "correct_choice": "only one option should be correct and output A,B,C or D"
        },
        {
            "Question": "MCQ question 2 statement",
            "A": "choice a statement",
            "B": "choice b statement",
            "C": "choice c statement",
            "D": "choice d statement",
            "correct_choice": "only one option should be correct and output A,B,C or D"
        }
    ]
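
    Since the LLM is only asked to follow this format, its output may not always conform. A minimal sketch of a check, assuming the key names shown above; validate_mcq is a hypothetical helper, not part of the original code:

```python
REQUIRED_KEYS = ("Question", "A", "B", "C", "D", "correct_choice")


def validate_mcq(mcq: dict) -> list:
    """Return a list of problems found in one MCQ; an empty list means it looks valid."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in mcq]
    choice = str(mcq.get("correct_choice", "")).strip()
    if "correct_choice" in mcq and choice not in ("A", "B", "C", "D"):
        problems.append(f"correct_choice must be A, B, C or D, got {choice!r}")
    return problems


good = {"Question": "2 + 2 = ?", "A": "3", "B": "4", "C": "5", "D": "6", "correct_choice": "B"}
print(validate_mcq(good))  # []
```

    Running such a check before building the quiz matters because the quiz code later looks up the correct answer with q[q['correct_choice']], which fails if correct_choice is not exactly one of the four letters.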
    

    Loop Through Documents (or Entire Lecture Content):

    for i in range(no_docs):
        # System prompt: sets the LLM's role
        template = """You are a helpful teaching assistant for the course on {course_name} \
                that helps in creating MCQs after each lecture."""
        system_message_prompt = SystemMessagePromptTemplate.from_template(template)
        # Human prompt: the request itself, with the chunk text and the example format
        human_template = """Generate {no_mcqs} MCQs given the lecture content:
                {lecture}
                Output a python list of MCQs. Example output of few MCQs is as follows.
                {mcq_format}
                Only output list of MCQs and nothing else. Conform to the above format. \
                Make sure that the generated MCQs are correct.
                """
        human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
        chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])
        chain = LLMChain(llm=chat, prompt=chat_prompt)
        mcqs_ = chain.run(course_name=course_name, no_mcqs=no_of_mcqs, mcq_format=json.dumps(mcq_format), lecture=docs[i].page_content)
        # The reply is expected to be a JSON list; json.loads raises an error if it is not
        mcqs.extend(json.loads(mcqs_))
    return json.dumps(mcqs)
    
    • The loop iterates over each chunk in docs. If the lecture fits in a single chunk, the loop runs once over the entire lecture content.
    • Define prompts for the LLM conversation.
      • system_message_prompt: Sets the context by casting the LLM as a teaching assistant for the course.
      • human_message_prompt: Instructs the LLM to generate the desired number of MCQs based on the provided lecture content and example format. It emphasizes returning only the list of MCQs in the specified format and ensuring their accuracy.
      • chat_prompt: Combines both prompts to guide the LLM conversation.
    • Run the chain and collect the results: the LLM's reply for each chunk is parsed with json.loads and appended to mcqs, which is returned as a JSON string once all chunks are processed.
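
    Even when told to output only the list, an LLM sometimes wraps its reply in extra text or a code fence, which makes a direct json.loads call fail. A minimal sketch of a more forgiving parser, assuming the reply contains a single JSON list; parse_mcq_list is a hypothetical helper, not part of the original code:

```python
import json


def parse_mcq_list(raw: str) -> list:
    """Extract the outermost [...] span from an LLM reply and parse it as JSON."""
    start = raw.find("[")
    end = raw.rfind("]")
    if start == -1 or end <= start:
        raise ValueError("no JSON list found in the model output")
    # json.loads raises json.JSONDecodeError (a ValueError) if the span is not valid JSON
    return json.loads(raw[start:end + 1])


fence = "`" * 3  # code-fence marker, built indirectly to keep this example easy to embed
reply = (
    f"Here are the MCQs:\n{fence}json\n"
    '[{"Question": "Q1", "A": "a", "B": "b", "C": "c", "D": "d", "correct_choice": "A"}]'
    f"\n{fence}"
)
print(len(parse_mcq_list(reply)))  # 1
```

    This only trims text around the list; a reply that uses single-quoted Python literals or contains stray brackets after the list would still be rejected.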

    Integrating MCQs into the Google Forms/Quiz

    The create_google_quiz function takes a Google Form ID and a list of MCQs (as a JSON string) and attempts to create a quiz in the specified form using the Google Forms API.

    Function Definition and Inputs:

    import json
    from google.oauth2 import service_account
    from googleapiclient.discovery import build
    def create_google_quiz(formId, mcqs):
    
    • formId: The ID of the Google Form where the quiz will be created.
    • mcqs: A JSON string containing a list of MCQs in the format defined earlier (question, answer choices, correct answer).

    Processing the MCQs:

    mcqs_ = json.loads(mcqs)
    
    • Converts the provided JSON string containing the MCQs into a Python list using json.loads.

    Google API Authentication:

    scopes = ['https://www.googleapis.com/auth/forms.body']
    credentials_path = './atomic-rune-391314-ef1d03d774b2.json'
    
    credentials = service_account.Credentials.from_service_account_file(credentials_path, scopes=scopes)
    service = build('forms', 'v1', credentials=credentials)
    
    • Defines the scope needed for accessing the Google Forms API specifically for editing forms (https://www.googleapis.com/auth/forms.body).
    • Points to the path of a service account credentials file (JSON) containing the access key. The path shown here is relative, so the file must sit in the directory the app is run from. You need to create your own key file in the Google Cloud console.
    • Make sure to copy the service account's email address (the client_email field in the key file), as it must be added as a collaborator to the Google Form where you want to insert the MCQs.
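
    The email to add as a collaborator can be read straight from the key file, which stores it under client_email. A minimal sketch; the helper name service_account_email is hypothetical, and the demo uses a stand-in file with a made-up address rather than a real key:

```python
import json
import os
import tempfile


def service_account_email(credentials_path: str) -> str:
    """Read the service account's email address from its JSON key file."""
    with open(credentials_path, encoding="utf-8") as fh:
        return json.load(fh)["client_email"]


# Demo with a stand-in key file; a real key file contains more fields.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"client_email": "quiz-bot@example-project.iam.gserviceaccount.com"}, tmp)
print(service_account_email(tmp.name))
os.remove(tmp.name)
```

    In the Streamlit app, the returned address could replace the placeholder text shown next to the form ID input.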

    Converting Form to a Quiz:

    update = {
        "requests": [
            {
            "updateSettings": {
                "settings": {
                    "quizSettings": {
                        "isQuiz": True
                    }
                },
                "updateMask": "quizSettings"
            }
            }
        ]
    }
    
    request = service.forms().batchUpdate(
          body=update,
          formId=formId
          ).execute()
    
    • Creates an update request object for the Google Forms API.
    • This specific request sets the "isQuiz" property to True within the form settings, effectively converting it into a quiz.
    • Executes the update request using the forms API service.

    Looping Through MCQs and Creating Questions:

    count = 0
    for q in mcqs_:
    
        question = {
          "requests": [
            {
            "createItem": {
              "item": {
                  "title": q['Question'].replace('\n','\r'),
                  "questionItem": {
                    "question": {
                      "choiceQuestion": {
                        "options": [
                            {
                                "value": q['A']
                            },
                            {
                                "value": q['B']
                            },
                            {
                                "value": q['C']
                            },
                            {
                                "value": q['D']
                            }
                          ],
                          "type": "RADIO"
                        },
                        "required": True,
                        "grading": {
                          "pointValue": 1,
                          "correctAnswers": {
                            "answers": [
                              {
                                  "value": q[q['correct_choice']]
                              }
                            ]
                          }
                        }
                      }
                  }
                },
              "location": {
                "index": 0
              }
            }
          }
          ]
        }
    
        request = service.forms().batchUpdate(
            body=question,
            formId=formId
            ).execute()
        count += 1
    
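    • Loops through each MCQ in the processed list (mcqs_).
    • Generates a question object for the Google Forms API, including the question title, the four answer choices, a point value of 1, and the correct answer.
    • Sends one batchUpdate request per question. Each question is inserted at index 0, so the questions end up in the form in the reverse order of the list.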
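
    Since batchUpdate accepts a list of requests, the questions could also be sent in a single call. The sketch below is one possible variation, not the original implementation: it builds the same createItem payload as above but places question i at index i so the list order is preserved. The helper names build_question_item and build_quiz_body are hypothetical:

```python
def build_question_item(q: dict, index: int) -> dict:
    """Build one createItem request for a single-answer (RADIO) quiz question."""
    return {
        "createItem": {
            "item": {
                "title": q["Question"].replace("\n", "\r"),
                "questionItem": {
                    "question": {
                        "required": True,
                        "choiceQuestion": {
                            "type": "RADIO",
                            "options": [{"value": q[k]} for k in "ABCD"],
                        },
                        "grading": {
                            "pointValue": 1,
                            "correctAnswers": {
                                # correct_choice holds a letter; look up that option's text
                                "answers": [{"value": q[q["correct_choice"]]}]
                            },
                        },
                    }
                },
            },
            "location": {"index": index},
        }
    }


def build_quiz_body(mcqs: list) -> dict:
    """Collect all questions into one batchUpdate body, keeping the list order."""
    return {"requests": [build_question_item(q, i) for i, q in enumerate(mcqs)]}


sample = [
    {"Question": "Q1", "A": "a1", "B": "b1", "C": "c1", "D": "d1", "correct_choice": "C"},
    {"Question": "Q2", "A": "a2", "B": "b2", "C": "c2", "D": "d2", "correct_choice": "A"},
]
body = build_quiz_body(sample)
print(len(body["requests"]))  # 2
# With an authenticated service this body would be sent as:
# service.forms().batchUpdate(formId=formId, body=body).execute()
```

    Sending one request body instead of one per question reduces the number of API calls, at the cost that a single invalid question causes the whole batch to be rejected.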

    Run the app using:

    streamlit run app.py
    

    Watch the video below to learn how to use the app.


    GitHub repository: https://github.com/cameledge/generateMCQs

    Conclusion

    By combining Streamlit, an LLM, and the Google Forms API, we've built a tool that generates multiple-choice questions from lecture PDFs and loads them into a Google Forms quiz. This automates most of the work of creating educational assessments.