# End-of-chapter quiz[[end-of-chapter-quiz]]

This chapter covered a lot of ground! Don't worry if you didn't grasp all the details; the next chapters will help you understand how things work under the hood.

Before moving on, though, let's test what you learned in this chapter.

### 1. The `load_dataset()` function in 🤗 Datasets allows you to load a dataset from which of the following locations? 

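<Question
	choices={[
		{
			text: "Your local machine",
			explain: "Correct! You can pass the paths of your local files to the data_files argument of load_dataset() to load local datasets.",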
			correct: true
		},
		{
			text: "The Hugging Face Hub",
			explain: "Correct! You can load datasets on the Hub by providing the dataset ID, e.g. load_dataset('emotion').",
			correct: true
		},
		{
			text: "A remote server",
			explain: "Correct! You can pass URLs to the data_files argument of load_dataset() to load remote files.",
			correct: true
		},
	]}
/>

### 2. Suppose you load one of the GLUE tasks as follows:

```py
from datasets import load_dataset

dataset = load_dataset("glue", "mrpc", split="train")
```

Which of the following commands will produce a random sample of 50 elements from `dataset`?

<Question
	choices={[
		{
			text: "dataset.sample(50)",
			explain: "This is incorrect -- there is no Dataset.sample() method."
		},
		{
			text: "dataset.shuffle().select(range(50))",
			explain: "Correct! As you saw in this chapter, you first shuffle the dataset and then select the samples from it.",
			correct: true
		},
		{
			text: "dataset.select(range(50)).shuffle()",
			explain: "This is incorrect -- although the code will run, it will only shuffle the first 50 elements in the dataset."
		}
	]}
/>
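The difference between the two orderings can be illustrated without downloading anything. The snippet below is a minimal plain-Python sketch, not the 🤗 Datasets implementation: `rows` is a made-up list of row indices standing in for a dataset, and `shuffle_then_select` and `select_then_shuffle` are hypothetical helper names chosen for this illustration.

```py
import random

# Made-up stand-in for the row indices of a dataset (assumption: 1,000 rows).
rows = list(range(1000))


def shuffle_then_select(rows, k, seed=None):
    """Shuffle every row, then keep the first k: a random sample of the whole set."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return shuffled[:k]


def select_then_shuffle(rows, k, seed=None):
    """Keep the first k rows, then shuffle them: same rows, only reordered."""
    rng = random.Random(seed)
    head = list(rows[:k])
    rng.shuffle(head)
    return head


random_sample = shuffle_then_select(rows, 50, seed=42)
first_fifty = select_then_shuffle(rows, 50, seed=42)

# The second approach can only ever contain rows 0..49.
print(sorted(first_fifty) == list(range(50)))
```

With 🤗 Datasets itself, the first pattern corresponds to `dataset.shuffle(seed=42).select(range(50))`, where the optional `seed` makes the sample reproducible.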

### 3. Suppose you have a dataset about household pets called `pets_dataset`, which has a `name` column that denotes the name of each pet. Which of the following approaches would allow you to filter the dataset for all pets whose names start with the letter "L"?

<Question
	choices={[
		{
			text: "pets_dataset.filter(lambda x : x['name'].startswith('L'))",
			explain: "Correct! Using a Python lambda function for these quick filters is a great idea. Can you think of another solution?",
			correct: true
		},
		{
			text: "pets_dataset.filter(lambda x['name'].startswith('L'))",
			explain: "This is incorrect -- a lambda function takes the general form lambda *arguments* : *expression*, so you need to provide arguments in this case."
		},
		{
			text: "Create a function like def filter_names(x): return x['name'].startswith('L') and run pets_dataset.filter(filter_names).",
			explain: "Correct! Just like with Dataset.map(), you can pass explicit functions to Dataset.filter(). This is useful when you have some complex logic that isn't suitable for a short lambda function. Which of the other solutions would work?",
			correct: true
		}
	]}
/>
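To see that a lambda and a named function are interchangeable as filter predicates, here is a small plain-Python sketch. It uses the built-in `filter()` on a list of dictionaries rather than `Dataset.filter()`, and the rows in `pets` are invented for illustration; the predicate takes one row and returns a boolean, which is the same shape of function used in the answers above.

```py
# Invented rows standing in for pets_dataset (assumption: each row has a "name" key).
pets = [{"name": "Luna"}, {"name": "Max"}, {"name": "Leo"}]


def filter_names(x):
    """Return True for rows whose name starts with the letter L."""
    return x["name"].startswith("L")


# A lambda needs its argument list before the colon: lambda x: <expression>.
with_lambda = list(filter(lambda x: x["name"].startswith("L"), pets))

# A named function is passed by name, without calling it.
with_function = list(filter(filter_names, pets))

print(with_lambda == with_function)
```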

### 4. What is memory mapping?

### 5. Which of the following are the main benefits of memory mapping?

### 6. Why does the following code fail?

```py
from datasets import load_dataset

dataset = load_dataset("allocine", streaming=True, split="train")
dataset[0]
```

<Question
	choices={[
		{
			text: "It tries to index into an IterableDataset.",
			explain: "Correct! An IterableDataset yields its examples lazily and is not an indexable container, so you should access its first element using next(iter(dataset)).",
			correct: true
		},
		{
			text: "The allocine dataset doesn't have a train split.",
			explain: "This is incorrect -- check out the [allocine dataset card](https://huggingface.co/datasets/allocine) on the Hub to see which splits it contains."
		}
	]}
/>
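The same behaviour can be reproduced with an ordinary Python generator, which is a rough analogue of a streamed dataset. This is a sketch under that assumption only: `stream_rows` is a hypothetical function and its rows are made up, so it shows the access pattern rather than how `IterableDataset` is implemented.

```py
def stream_rows():
    """Hypothetical stand-in for a streamed dataset: yields rows one at a time."""
    for i in range(3):
        yield {"review": f"example {i}", "label": i % 2}


stream = stream_rows()

# Indexing fails because a generator produces rows lazily and stores none of them.
try:
    stream[0]
    indexing_error = None
except TypeError as err:
    indexing_error = err

# Iterating works: ask the iterator for its next (here, first) row.
first_row = next(iter(stream))
print(first_row)
```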

### 7. Which of the following are the main benefits of creating a dataset card?

### 8. What is semantic search?

### 9. For asymmetric semantic search, you usually have:

### 10. Can I use 🤗 Datasets to load data for use in other domains, like speech processing?

<Question
	choices={[
		{
			text: "No",
			explain: "This is incorrect -- check out the MNIST dataset on the Hub for a computer vision example."
		},
		{
			text: "Yes",
			explain: "Correct! Check out the exciting developments with speech and vision in the 🤗 Transformers library to see how 🤗 Datasets is used in these domains.",
			correct: true
		},
	]}
/>

