CS460 Introduction to Database Systems - Problem Set 2 - Converting the Oscar table to XML

Problem Set 2

Part I due by 11:59 p.m. on Tuesday, October 8, 2024.
Part II due by 11:59 p.m. on Tuesday, October 15, 2024.

Part I

40 points total

Creating the necessary folder

Create a subfolder called ps2 within your cs460 folder, and put all of the files for this assignment in that folder.

Creating the necessary file

This part of the assignment will all be completed in a single PDF file. To create it, you should do the following:

1. Access the template that we have created by clicking on this link and signing into your Google account as needed.
2. When asked, click on the Make a copy button, which will save a copy of the template file to your Google Drive.
3. Select File->Rename, and change the name of the file to ps2_partI.
4. Add your work for the problems from Part I to this file.
5. Once you have completed all of these problems, choose File->Download->PDF document, and save the PDF file in your ps2 folder. The resulting PDF file (ps2_partI.pdf) is the one that you will submit. See the submission guidelines at the end of Part I.

Problem 1: Fixed-length and variable-length records

19 points total

Recall the Movie table from our movie database in Problem Set 1. Assume that we are using a simplified version of that table with the following schema:

Movie(id CHAR(5), name VARCHAR(20), year INTEGER, runtime INTEGER, rating VARCHAR(5))

Consider the following tuple from that table:

('14310', 'Deadpool', 2016, 108, 'R')

1. (3 points) What would this tuple look like if we stored it in a fixed-length record? In the 1.1 and 1.2 section of ps2_partI (see above), put your answer in the table labeled record contents.

You should observe the following conventions:
- Give each data value its own cell of the table. Adjust the widths of the cells as needed to better fit the sizes of the values.
2. (2 points) What is the length in bytes of the record from part 1? Assume that we are using:
- four-byte integer field values
- one-byte characters – including any digit characters that are part of a CHAR or VARCHAR.
Put your final answer in the box labeled length in bytes, and show your work in the box below the answer.
3. (3 points) What would this tuple look like if we stored it in a variable-length record in which each field is preceded by its length?

In the 1.3 and 1.4 section of ps2_partI, put your answer in the table labeled record contents.

In addition to the conventions that we specified for part 1, you should also give each metadata value its own cell of the table. Change the background color of cells containing metadata to distinguish them from cells containing actual data values. You can do so by using the icon that looks like a paint can in the menu bar at the top of Google Docs.

In addition to the assumptions about the sizes of characters and integers that we gave you in part 2, you should assume that integers used for metadata are two bytes long (not four bytes).
4. (2 points) What is the length in bytes of the record from part 3? Make the same assumptions stated in parts 2 and 3. Put your final answer in the box labeled length in bytes, and show your work in the box below the answer.
5. (4 points) What would this tuple look like if we stored it in a variable-length record that begins with a header of offsets?

In the 1.5 and 1.6 section of ps2_partI, put your answer in the table labeled record contents. Use the same conventions that we specified for parts 1 and 3, and use the same assumptions about the sizes of characters, integer field values, and integer metadata that we gave you in parts 2 and 3.
6. (2 points) What is the length in bytes of the record from part 5? Put your final answer in the box labeled length in bytes, and show your work in the box below the answer.
7. (3 points) Now consider the following Movie tuple:
```
 ('12624', 'Wicked', 2024, NULL, 'PG')
```
This tuple is for the film adaptation of the musical Wicked, which is coming to movie theatres this November. The NULL value for runtime reflects the fact that the movie’s runtime has not yet been officially announced.

What would this tuple look like if we stored it in a variable-length record that begins with a header of offsets?

In the 1.7 section of ps2_partI, put your answer in the table labeled record contents. You should use:
- the approach to NULL values that we took in lecture
- the same conventions that we specified for parts 1 and 3
- the same assumptions about the sizes of characters, integer field values, and integer metadata that we gave you in parts 2 and 3.
There is no separate length-computation question for this record.
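
The size assumptions above reduce to simple arithmetic. The following is only a rough sketch of that arithmetic for a hypothetical three-field tuple, ('A1', 'Ann', 7) under the schema (CHAR(2), VARCHAR(10), INTEGER), not for the Movie tuple. The class and method names are invented for illustration. It also assumes that a fixed-length record reserves the declared maximum size for every field, that the length-prefixed format stores a two-byte length before every field (including fixed-length ones), and that the offset header holds one two-byte offset per field plus one marking the end of the record. Those assumptions should be checked against the conventions from lecture.

```java
class RecordLengthSketch {
    static final int INT_BYTES = 4;   // four-byte integer field values
    static final int META_BYTES = 2;  // two-byte integers used for metadata

    // Fixed-length record: assumes every field occupies its declared maximum size.
    static int fixedLength(int[] declaredSizes) {
        int total = 0;
        for (int size : declaredSizes) {
            total += size;
        }
        return total;
    }

    // Variable-length record in which each field is preceded by its length
    // (assumption: every field gets a length, even fixed-size ones).
    static int lengthPrefixed(int[] actualSizes) {
        int total = 0;
        for (int size : actualSizes) {
            total += META_BYTES + size;
        }
        return total;
    }

    // Variable-length record with a header of offsets
    // (assumption: one offset per field plus one for the end of the record).
    static int offsetHeader(int[] actualSizes) {
        int total = (actualSizes.length + 1) * META_BYTES;
        for (int size : actualSizes) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical tuple ('A1', 'Ann', 7) for (CHAR(2), VARCHAR(10), INTEGER)
        int[] declared = {2, 10, INT_BYTES};
        int[] actual = {2, 3, INT_BYTES};
        System.out.println(fixedLength(declared));   // 2 + 10 + 4 = 16
        System.out.println(lengthPrefixed(actual));  // (2+2) + (2+3) + (2+4) = 15
        System.out.println(offsetHeader(actual));    // 4*2 + (2+3+4) = 17
    }
}
```
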

Problem 2: Index structures

21 points total; 7 points each part

Let’s say that you want to insert items with the following sequence of keys into a collection of records that uses some form of indexing:

10, 11, 12, 4, 9, 14, 1, 2, 7, 19, 18, 5, 3

1. Insert this key sequence into an initially empty B-tree of order 2. In section 2.1 of ps2_partI, show the tree after each insertion that causes a split of one or more nodes, and the final tree.

We have given you a sample diagram that includes nodes of different sizes. Make copies of the diagram so that you can use separate diagrams for the results of each insertion that causes a split, and for the final tree. Note that you do not need to keep the shape of the tree that we have given you. Rather, you should edit it as needed: deleting or adding nodes and edges, replacing the Xs with keys, adding or removing keys, and making whatever other changes are needed.
2. Insert this same key sequence into an initially empty B+tree (note the +) of order 2. In section 2.2 of ps2_partI, show the tree after each insertion that causes a split of one or more nodes, and the final tree. Here again, you should make copies of the diagram that we have given you and edit them as needed.
3. Insert this same key sequence into a hash table that uses linear hashing. In section 2.3 of ps2_partI, use the tables that we have provided to show the state of the table before and after each increase in the number of buckets, as well as the final state of the table.

Important details:
- The table should use the hash function h(x) = x, and it should start out with two empty buckets.
- Within a given bucket, please list the keys in the order in which they were inserted.
- Assume that a bucket is added whenever the number of items in the table exceeds three times the number of buckets.
- An item that causes the table to grow should appear in both the before and after tables for that increase. See the PS 2 FAQ for more details.
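
The growth rule in these details can be sketched in code. What follows is a minimal illustration of one common formulation of linear hashing, not necessarily the exact variant from lecture. It assumes non-negative keys, uses the rightmost i = ceil(log2 n) bits of h(x) = x to pick a bucket, clears the leading bit when that value names a bucket that does not exist yet, and splits the bucket whose number is the new bucket's number with its leading bit cleared. The class name, method names, and the key sequence 0 through 6 are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class LinearHashSketch {
    private final List<List<Integer>> buckets = new ArrayList<>();
    private int numItems = 0;

    LinearHashSketch() {
        // start out with two empty buckets
        buckets.add(new ArrayList<>());
        buckets.add(new ArrayList<>());
    }

    int numBuckets() {
        return buckets.size();
    }

    List<Integer> bucket(int index) {
        return buckets.get(index);
    }

    // i = ceil(log2(n)) for the current number of buckets n
    private int numBits() {
        int i = 0;
        while ((1 << i) < buckets.size()) {
            i++;
        }
        return i;
    }

    // h(x) = x; use the rightmost i bits, clearing the leading bit
    // if the result names a bucket that has not been created yet
    int bucketFor(int key) {
        int i = numBits();
        int b = key & ((1 << i) - 1);
        if (b >= buckets.size()) {
            b -= (1 << (i - 1));
        }
        return b;
    }

    void insert(int key) {
        buckets.get(bucketFor(key)).add(key);
        numItems++;
        // add a bucket when the item count exceeds three times the bucket count
        if (numItems > 3 * buckets.size()) {
            grow();
        }
    }

    private void grow() {
        int newIndex = buckets.size();
        buckets.add(new ArrayList<>());
        int splitIndex = newIndex - (1 << (numBits() - 1));
        List<Integer> keep = new ArrayList<>();
        // rehash only the keys of the bucket being split, preserving insertion order
        for (int key : buckets.get(splitIndex)) {
            if (bucketFor(key) == newIndex) {
                buckets.get(newIndex).add(key);
            } else {
                keep.add(key);
            }
        }
        buckets.set(splitIndex, keep);
    }

    public static void main(String[] args) {
        LinearHashSketch table = new LinearHashSketch();
        for (int key = 0; key <= 6; key++) {
            table.insert(key);   // the seventh item (key 6) triggers the first split
        }
        for (int b = 0; b < table.numBuckets(); b++) {
            System.out.println(b + ": " + table.bucket(b));
        }
    }
}
```

With these hypothetical keys, the seventh insertion pushes the item count past 3 × 2, so bucket 2 is added and bucket 0 is split between buckets 0 and 2.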

Submitting your work for Part I

Log in to Gradescope by clicking the link in the left-hand navigation bar. Once you are logged in, click on the box for CS 460.

Submit your ps2_partI.pdf file using these steps:

1. If you still need to create a PDF file, open your file on Google Drive, choose File->Download->PDF document, and save the PDF file in your ps2 folder.
2. Click on PS 2: Part I in the list of assignments on Gradescope. You should see a pop-up window labeled Submit Assignment. (If you don’t see it, click the Submit or Resubmit button at the bottom of the page.)
3. Choose the Submit PDF option, and then click the Select PDF button and find the ps2_partI.pdf that you created in step 1. Then click the Upload PDF button.
4. You should see an outline of the problems along with thumbnails of the pages from your uploaded PDF. For each problem in the outline:
- Click the title of the problem.
- Click the page(s) on which your work for that problem can be found.
As you do so, click on the magnifying glass icon for each page and double-check that the pages that you see contain the work that you want us to grade.
5. Once you have assigned pages to all of the problems in the question outline, click the Submit button in the lower-right corner of the window.
6. You should see a box saying that your submission was successful. Click the (x) button to close that box.
7. You can use the Resubmit button at the bottom of the page to resubmit your work as many times as needed before the final deadline.

Part II

60 points total

Problem 3: Converting the Oscar table to XML

25 points; pair-optional

This is the only problem of the assignment that you may complete with a partner. See the rules for working with a partner on pair-optional problems for details about how this type of collaboration must be structured.

In this problem, you will write a series of methods that can be used to create an XML version of the Oscar table from Problem Set 1. Your methods will be part of a larger program that uses the JDBC framework to connect to the SQLite database that you used in PS 1 and to execute the SQL queries needed to extract the necessary data.

Obtaining the necessary files

You should begin by downloading the following zip file:
problem3.zip

Unzip/extract the contents of the file.

Depending on your system, after extracting the contents you will either have:

- a folder named problem3 that contains all of the files that you need for this problem
- an outer folder called problem3 that contains an inner folder named problem3 that contains all of the Java files that you need for this problem.

Take the problem3 folder that actually contains the necessary files and drag it into your ps2 folder so that you can easily find and open it from within VS Code.

Getting started

1. Read through our overview of the JDBC framework.
2. Launch VS Code on your laptop.
3. In VS Code, select the File->Open Folder or File->Open menu option, and use the resulting dialog box to find and open the problem3 folder that you created above – the one that contains the provided files. (Note: You must open the folder; it is not sufficient to simply open one of the Java files in the folder.)

The name of the folder should appear in the Explorer pane on the left-hand side of the VS Code window, along with a list of all of its contents.
4. Click on the name XMLforOscars.java in the Explorer pane, which will open the file that you need to modify.
5. Review all of the code that we’ve provided before you start writing any new code. See below for some additional information on what we’ve given you.

The provided code

The class that you will be completing is called XMLforOscars. In this class we’ve given you:

- the constructor for the class, which takes the name of a SQLite file that should contain a relational database with the schema outlined in Problem Set 1; the constructor establishes a connection to the SQLite database, and it stores the resulting Connection object in a field called db.
- a helper method called simpleElem(), which takes as inputs the name and value of a simple XML element and returns a string of the form "<name>value</name>". This method should only be used to form simple elements – ones that do not have any attributes or child elements.
- a helper method called resultsFor(), which takes as input a string representing a SQL query for the movie database and returns a ResultSet object that can be used to process the results of that query. You are welcome to use this method in the code that you write, although doing so is not required.
- a method called personIdFor(), which takes as input the name of a person and performs a query to find and return the person’s id; this method will be useful when testing the methods that you write.
- a method called movieIdFor(), which takes as input the name of a movie and performs a query to find and return the movie’s id; this method will also be useful when testing the methods that you write. (Note: Although movie names are not unique, we can safely ignore that fact for the purposes of this method.)
- a method called createFile(), which performs a query to obtain all distinct years in the Oscar

For example, if you run the following test code (adding it to the main method in TestDriver.java):
```
XMLforOscars xml = new XMLforOscars("movie.sqlite");
System.out.println(xml.movieElemFor(xml.movieIdFor("Black Panther")));
System.out.println(xml.movieElemFor("1234567"));   // no movie with that id
System.out.println(xml.movieElemFor(xml.movieIdFor("Barbie")));
```
you should see:
```
      <movie id="1825683">Black Panther</movie>
       
       
      <movie id="1517268">Barbie</movie>
       
```

Notes:
- You should see an extra blank line when you print the results of any call that produces a movie element (including the blank line after the results for Barbie shown above), because the string returned by the method should end with a newline, and the println method adds its own newline.
- The second println statement prints only a blank line because there is no movie whose id is 1234567, and thus the call xml.movieElemFor("1234567") returns an empty string. As a result, there are two blank lines after the results for Black Panther: one after its element, and one from the printing of the empty string.
Important guidelines:
- You must begin by performing the appropriate SQL query. Use the personIdFor() and movieIdFor() methods as models for what you should do.
- When processing the results, make sure to follow the approach given in our JDBC overview.
- Because the movie element that you are forming has an attribute, you should not use the simpleElem() method to create it. Rather, you should construct it yourself using string concatenation.
- The start tag of the returned movie element should be preceded by exactly six spaces.
- Don’t forget to include the movie’s id as an attribute within the start tag for movie, as shown above. In order to include the quotes around the id, you will need to use the escape sequence "\"" for each double-quote character.
- There should be a single newline character (\n) and no extra spaces after the end tag of the element.
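
The quoting and spacing rules above can be illustrated with a small sketch. It deliberately uses a hypothetical element (book with an isbn attribute) and a hypothetical helper named attrElem(), and it performs no SQL query, so it shows only how the escape sequence \" and the trailing \n behave in a concatenated string, not how the required method should be written.

```java
class AttrElemSketch {
    // Hypothetical helper: builds an element with one attribute using plain
    // string concatenation. The start tag is preceded by exactly six spaces,
    // and the end tag is followed by a single newline with no extra spaces.
    static String attrElem(String name, String attr, String attrValue, String text) {
        return "      <" + name + " " + attr + "=\"" + attrValue + "\">"
               + text + "</" + name + ">\n";
    }

    public static void main(String[] args) {
        String elem = attrElem("book", "isbn", "12345", "Some Title");
        // println adds its own newline, so a blank line follows the element
        System.out.println(elem);
        // printing an empty string produces only a blank line
        System.out.println("");
    }
}
```
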
Implement the method called personElemFor() whose header we have provided. It takes a string representing the id number of a person, and it should return a string containing the XML for a single complex element of type person that includes nested child elements for:
- the name of the person (which you may assume is never null)
- the dob of the person (if it is non-null)
In addition, the returned person element must have an attribute named id for the person’s id number.

If there is no person with the specified id (including id values of null), the method should simply return the empty string. If there is a person with the specified id, the string returned by the method should end with a single newline character (\n).

For example, if you run the following test code (adding it to the main method in TestDriver.java):
```
XMLforOscars xml = new XMLforOscars("movie.sqlite");
System.out.println(xml.personElemFor(xml.personIdFor("Julianne Moore")));
System.out.println(xml.personElemFor("1234567"));
System.out.println(xml.personElemFor(xml.personIdFor("Chris Buck")));
```
you should see:
```
      <person id="0000194">
        <name>Julianne Moore</name>
        <dob>1960-12-03</dob>
      </person>
       
       
      <person id="0118333">
        <name>Chris Buck</name>
      </person>
       
```

Notes:
- Chris Buck has a dob value of null in our database, so his person element does not include a nested dob element.
- Here again, you should see an extra blank line when you print the results of any call that produces a person element (including the blank line after the results for Chris Buck shown above), because the string returned by the method should end with a newline, and the println method adds its own newline.
- The second println statement prints only a blank line because there is no person whose id is 1234567, and thus the call xml.personElemFor("1234567") returns an empty string. As a result, there are two blank lines after the results for Julianne Moore: one after her element, and one from the printing of the empty string.
Important guidelines:
- Here again, you must begin by performing the appropriate SQL query and processing the results following the approach given in our JDBC overview.
- The outer start and end tags of the returned person element should each be on their own line preceded by exactly six spaces.
- Don’t forget to include the person’s id as an attribute within the start tag for person, as shown above. In order to include the quotes around the id, you will need to use the escape sequence "\"" for each double-quote character.
- You must use the provided simpleElem() method to form the name and dob child elements.
- Each child element should be on its own line, and its start tag should be preceded by exactly eight spaces.
- There should be no extra spaces at the end of any line of the returned string.
Implement the method called awardElemFor() whose header we have provided. It takes three strings representing the type, person ID and movie ID of a single Oscar award, and it should return a string containing the XML for a complex element of type award that includes nested child elements for:
- the award’s type (use simpleElem() to get this)
- the person (if any) associated with the award (use personElemFor() to get this); see below for more details
- the movie associated with the award (use movieElemFor() to get this).
The string returned by the method should end with a single newline character (\n).

For example, if you run the following test code:
```
XMLforOscars xml = new XMLforOscars("movie.sqlite");
String movieId = xml.movieIdFor("Oppenheimer");
String personId = xml.personIdFor("Cillian Murphy");
System.out.println(xml.awardElemFor("BEST-PICTURE", null, movieId));
System.out.println(xml.awardElemFor("BEST-ACTOR", personId, movieId));
```
you should see:
```
    <award>
      <type>BEST-PICTURE</type>
      <movie id="1539877">Oppenheimer</movie>
    </award>
 
    <award>
      <type>BEST-ACTOR</type>
      <person id="0614165">
        <name>Cillian Murphy</name>
        <dob>1976-05-25</dob>
      </person>
      <movie id="1539877">Oppenheimer</movie>
    </award>
 
```

Here again, you should see an extra blank line when you print the results of a call to awardElemFor (including the blank lines after each of the award elements shown above), because the string returned by the method should end with a newline, and the println method adds its own newline.

Important guidelines:
- This method should not perform any queries of its own. Rather, you must use your previous methods to obtain the necessary child elements, as specified above.
- For awards of type "BEST-PICTURE", our database does not store an associated person. As a result, the value of personId should be null in such cases, and you should not include a nested child element of type person in the string that you return for those awards.
- Your method should throw an IllegalArgumentException when any of the following are true:
  - the value of the type parameter is null
  - the value of movieId is null
  - the value of personId is null for type values other than BEST-PICTURE.
  With the exception of these cases, you may assume that the inputs to the method are otherwise valid.
- The outer start tag and outer end tag should each be on their own line preceded by exactly four spaces and followed by a single newline character.
- The type element should be on its own line preceded by exactly six spaces and followed by a single newline character.
- The person and movie elements should each have the same spacing and formatting described in the earlier method specifications, so you won’t need to add any new spaces or newlines to them.
- Once again, there should be no extra spaces at the end of any line.
Implement the method called oscarsForYear() whose header we have provided. It takes a string representing a year, and it should return a string containing the XML for a single complex element of type oscars_for_year that includes:
- a nested child element of type year for the specified year (use simpleElem() to get this)
- a sequence of nested child elements of type award, one for each Oscar that was awarded in that year (use a separate call to awardElemFor() to obtain each of them).
If there are no Oscars for the specified year, the method should simply return the empty string. In either case, the returned string should NOT end with a newline character.

For example, if you run the following test code:
```
XMLforOscars xml = new XMLforOscars("movie.sqlite");
System.out.println(xml.oscarsForYear("2024"));
```
you should see:
```
  <oscars_for_year>
    <year>2024</year>
    <award>
      <type>BEST-PICTURE</type>
      <movie id="1539877">Oppenheimer</movie>
    </award>
    <award>
      <type>BEST-ACTOR</type>
      <person id="0614165">
        <name>Cillian Murphy</name>
        <dob>1976-05-25</dob>
      </person>
      <movie id="1539877">Oppenheimer</movie>
    </award>
    <award>
      <type>BEST-ACTRESS</type>
      <person id="1297015">
        <name>Emma Stone</name>
        <dob>1988-11-06</dob>
      </person>
      <movie id="1423045">Poor Things</movie>
    </award>
    <award>
      <type>BEST-SUPPORTING-ACTRESS</type>
      <person id="5007768">
        <name>Da'Vine Joy Randolph</name>
        <dob>1986-05-21</dob>
      </person>
      <movie id="1484919">The Holdovers</movie>
    </award>
    <award>
      <type>BEST-SUPPORTING-ACTOR</type>
      <person id="0000375">
        <name>Robert Downey Jr.</name>
        <dob>1965-04-04</dob>
      </person>
      <movie id="1539877">Oppenheimer</movie>
    </award>
    <award>
      <type>BEST-DIRECTOR</type>
      <person id="0634240">
        <name>Christopher Nolan</name>
        <dob>1970-07-30</dob>
      </person>
      <movie id="1539877">Oppenheimer</movie>
    </award>
  </oscars_for_year>
```

Important guidelines:
- You must begin by performing the appropriate SQL query and processing the results following the approach given in our JDBC overview.
- You shouldn’t make any assumptions about the number of awards in a given year. Rather, your code should be able to handle an arbitrary number of awards. This means that you will need to use a loop to process the results of your query, similar to the way that the createFile() method uses a loop. If there are no awards for the specified year, the method should return an empty string.
- You must use the provided simpleElem() method to form the XML for the year element and your own awardElemFor() method to obtain the XML for each award.
- The outer start tag and outer end tag should each be on their own line preceded by exactly two spaces. The start tag should be followed by a single newline character, but the end tag should not be.
- The year element should be on its own line preceded by exactly four spaces and followed by a single newline character.
- The award elements should each have the same spacing and formatting described in the previous method specification, so you won’t need to add any new spaces or newlines to them.
- Once again, there should be no extra spaces at the end of any line.

Once you have completed and tested all of your methods, running the XMLforOscars program should create a file named oscars.xml that represents the entire Oscar table in XML!

Problem 4: Querying an XML database

30 points total

This problem asks you to construct XPath and XQuery queries for an XML version of our entire movie database. The schema of this XML database is described here.

Installing the software

To allow you to check your work, we’ll make use of a freely available XML DBMS called BaseX. You should begin by following the instructions for installing and using it that are available here.

Performing queries in BaseX

As outlined in our instructions, you can perform queries by taking the following steps:

1. Start up the BaseX GUI by double-clicking on the JAR file that you downloaded.
2. Select the Database->New menu option, click the Browse button, and use the resulting dialog box to find the imdb.xml file that you downloaded above.
3. Click Open to select the file, and click OK to create the database.
4. To execute a query, enter it in the Editor pane in BaseX, and click the green play button to execute it. (You can also use Ctrl+Enter or Ctrl+Return for this purpose.)
5. The results (if any) will be displayed in the Result pane.

If you have trouble getting BaseX to work on your machine, see the troubleshooting tips on our BaseX page.

Important guidelines

If you’re using a Mac, you should disable smart quotes, because they may lead to errors in BaseX and in our testing. There are instructions for doing so here.
ps2_queries.py is a Python file, so you could use a Python IDE to edit it, but a regular text editor like TextEdit or Notepad++ would also be fine. However, if you use a text editor, you must ensure that you save it as a plain-text file.
Construct the XQuery commands needed to solve the problems given below. Test each command in BaseX to make sure that it works.
Once you have finalized the XQuery command for a given problem, copy the command into your ps2_queries.py file, putting it between the triple quotes provided for that query’s variable. We have included a sample query to show you what the format of your answers should look like.
Each of the problems must be solved by means of a single query. Unless the problem specifies otherwise, you may use either a standalone XPath expression or an XQuery FLWOR expression.
The only place that you may use a subquery (i.e., a nested FLWOR expression) is in the return clause of an outer FLWOR expression. You should NOT have a nested FLWOR expression in a for clause or a let clause.
The order of the clauses in each query/subquery must follow the FLWOR acronym: a for clause (F), followed optionally by a let clause (L), followed optionally by a where clause (W), followed optionally by an order by clause (O), followed by a return clause (R). You should not put the clauses in a different order – e.g., for, followed by let, followed by another for, etc. BaseX may allow you to do this, but it is never necessary to do so, and such a query will often fail to run to completion in the Autograder.
Your queries should only use information provided in the problem itself. In addition, they should work for any XML database that follows the schema that we have specified.
When the results of a query include nested child elements, those child elements must be in the specified order with respect to each other. See the example results that are provided for each such problem.
You do not need to worry about indenting and line breaks in the results of your queries.
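
BaseX remains the place to test the queries for this problem. Purely as an illustration of what a standalone XPath expression does, the sketch below evaluates one with the JDK's built-in XPath 1.0 engine. The tiny library document, its element and attribute names, and the class and method names are all hypothetical and unrelated to the course schema, and this API does not support XQuery FLWOR expressions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Hypothetical document; not the schema used in this assignment.
    static final String XML =
        "<library>"
        + "<book year=\"1999\"><title>Old Book</title></book>"
        + "<book year=\"2020\"><title>New Book</title></book>"
        + "</library>";

    // Evaluates an XPath expression and returns the text of each matching node.
    static String[] evaluate(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            String[] result = new String[nodes.getLength()];
            for (int i = 0; i < nodes.getLength(); i++) {
                result[i] = nodes.item(i).getTextContent();
            }
            return result;
        } catch (Exception e) {
            // wrap checked parsing/XPath exceptions to keep the sketch short
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // select the titles of books whose year attribute is greater than 2000
        for (String title : evaluate(XML, "/library/book[@year > 2000]/title")) {
            System.out.println(title);
        }
    }
}
```

The predicate in square brackets filters the book elements before the path continues to title, which is the same step-and-predicate structure used in the path portion of an XQuery for clause.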