# N|uu recordings and transcriptions

This data collection contains recordings and transcriptions of the
N|uu language, together with material in South African Nama and a
local variety of Afrikaans known by the speakers as "Onse Afrikaans"
("Our Afrikaans").  All data were collected between 2001 and 2022 from
mother-tongue speakers of the target languages on site in Upington,
Askham and Witdraai in the Northern Cape.


## Files and directories


### transcriptions.ods

This spreadsheet contains the transcriptions and related information.
This information is organized on several sheets:

+ Lexicon: This sheet contains a list of N|uu words and their
  corresponding information (see below for a description of the
  columns).
+ Abbreviations: This sheet contains a list of the abbreviations used
  in the information in the spreadsheet.
+ Peoples names: This sheet contains a list of names of people and
  their corresponding information (see below for a description of the
  columns).
+ Greetings: This sheet contains a list of greetings and their
  corresponding information (see below for a description of the
  columns).
+ Place names: This sheet contains a list of names of places and their
  corresponding information (see below for a description of the
  columns).
+ Kliptaal: This sheet contains a list of Kliptaal words and their
  corresponding information (see below for a description of the
  columns).
+ Local_AfrikaansNamaSetswana: This sheet contains a list of collected
  words that come from other languages (Local Afrikaans, Nama, or
  Setswana) and their corresponding information (see below for a
  description of the columns).
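
As a quick check that the sheets listed above are present, the sheet names can be read directly from the spreadsheet. The following is a minimal sketch using only the Python standard library; it relies on the fact that an `.ods` file is a ZIP archive whose sheet data is stored in `content.xml`. The function name `list_sheets` is illustrative and not part of this data collection.

```python
import xml.etree.ElementTree as ET
import zipfile

# OpenDocument namespace used for table (sheet) elements.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"


def list_sheets(ods_path):
    """Return the sheet names of an .ods file in document order."""
    # An .ods file is a ZIP archive; the sheet data lives in content.xml.
    with zipfile.ZipFile(ods_path) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    name_attr = f"{{{TABLE_NS}}}name"
    return [table.get(name_attr) for table in root.iter(f"{{{TABLE_NS}}}table")]
```

Calling `list_sheets("transcriptions.ods")` from the root of the data collection should return names such as `Lexicon` and `Place names`.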

The different sheets have several columns:

+ Orthography 1: This column contains the orthography of the N|uu
  words, in both eastern and western dialects where applicable.  Note
  that the Kliptaal sheet has words from Kliptaal, and the
  Local_AfrikaansNamaSetswana sheet has words that are used in N|uu
  but come from other languages (Local Afrikaans, Nama, or Setswana).
+ English: This column contains the English translation of the word in
  the Orthography 1 column.
+ Part of Speech, English: This column contains the part of speech of
  the English translation.
+ Parentheticals, English: This column contains additional information
  in English about the word in the Orthography 1 column.
+ Recordings (target word in sentence): This column contains names of
  the audio files for recordings of the word in the Orthography 1
  column in a sentence.  These files can be found in the audio
  directory.
+ Dictionary Recording (target word only): This column contains names
  of the audio files for N|uu recordings of the word in the
  Orthography 1 column on its own.
+ Semantic code: This column contains semantic codes identifying the
  semantic category of the word in the Orthography 1 column.
+ IPA: This column contains the International Phonetic Alphabet (IPA)
  representation for the word in the Orthography 1 column.
+ Short Notes: This column contains additional notes on the word in
  the Orthography 1 column.
+ Afrikaans community feedback HEADWORD: This column contains the
  standard Afrikaans translation of the word in the Orthography 1
  column.
+ Afrikaans community feedback Local Variety: This column contains the
  local variety of Afrikaans (Onse Afrikaans) translation of the word
  in the Orthography 1 column if it is different from standard
  Afrikaans.
+ Afrik Parentheticals: This column contains additional information in
  Afrikaans about the word in the Orthography 1 column.
+ Nama Feedback: This column contains the Nama translation of the word
  in the Orthography 1 column.  Where Namibian Nama and South African
  Nama differ, the Namibian Nama form is marked with NKK.
+ Nama Feedback: This column contains additional information in Nama
  about the word in the Orthography 1 column.
+ Nama recording: This column contains names of the audio files for
  recordings of the Nama words for the word in the Nama Feedback
  column.
+ Notes: This column contains additional notes.
+ Additional recordings: This column contains names of the audio files
  of additional recordings related to the word in the Orthography 1
  column.
+ Possible cognates in...: These columns contain possible cognates
  from other related languages (as indicated by the text following
  "Possible cognates in").  Note that this is not a complete analysis.
+ Abbreviation: This column contains abbreviations that are used in
  this data collection.
+ Nama term: This column contains the Nama translation of the
  abbreviation in the Abbreviation column.
+ ENG: This column contains the English translation of the
  abbreviation in the Abbreviation column.
+ Eng abbrev.: This column contains the English abbreviation of the
  abbreviation in the Abbreviation column.
+ Afr. abbrev.: This column contains the Afrikaans abbreviation of the
  abbreviation in the Abbreviation column.
+ Afrikaans: This column contains the Afrikaans translation of the
  abbreviation in the Abbreviation column.
+ GPS Coordinates: This column contains GPS coordinates of the place
  names mentioned in the Orthography 1 column in the Place names
  sheet.
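
Not every column appears on every sheet, so it can help to inspect which columns a given sheet actually has. The sketch below reads the first row of a named sheet using only the Python standard library. It assumes that the column headers are in the first row of each sheet, which this document does not state explicitly; `header_row` is an illustrative name.

```python
import xml.etree.ElementTree as ET
import zipfile

# OpenDocument namespaces for table and text elements.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def header_row(ods_path, sheet_name):
    """Return the cell texts of the first row of the named sheet."""
    with zipfile.ZipFile(ods_path) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    for table in root.iter(f"{{{TABLE_NS}}}table"):
        if table.get(f"{{{TABLE_NS}}}name") != sheet_name:
            continue
        # Assumption: the first row of the sheet holds the column headers.
        row = table.find(f".//{{{TABLE_NS}}}table-row")
        if row is None:
            return []
        headers = []
        for cell in row.findall(f"{{{TABLE_NS}}}table-cell"):
            paragraphs = cell.findall(f"{{{TEXT_NS}}}p")
            text = " ".join("".join(p.itertext()) for p in paragraphs).strip()
            # Identical adjacent cells are stored once with a repeat count.
            repeat = int(cell.get(f"{{{TABLE_NS}}}number-columns-repeated", "1"))
            headers.extend([text] * repeat)
        # Drop the empty padding cells at the end of the row.
        while headers and not headers[-1]:
            headers.pop()
        return headers
    raise KeyError(f"no sheet named {sheet_name!r}")
```

For example, `header_row("transcriptions.ods", "Place names")` would be expected to include `GPS Coordinates` if the assumption about the header row holds.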

### audio directory

This directory contains the audio recordings.  These recordings have
been collected over time by several field linguists.  As a result, no
consistent file naming scheme is used.  The file names found in the
transcriptions.ods file correspond to those in the audio directory,
but the audio directory also contains additional files that are not
referenced in the spreadsheet.

Within the file names, different speakers are encoded using two-letter
abbreviations:

| Abbreviation | Name               |
| :----------- | :----------------- |
| AK           | Antjie Kassie      |
| UR           | \|Una Rooi         |
| KB           | Kheis Brou         |
| KE           | Katrina Esau       |
| AO           | Andries Oleyn      |
| GS           | Griet Seekoei      |
| FK           | Vytjie Koper       |
| SS           | Simon Sauls        |
| JK           | Johana Koper       |
| HK           | Hannie Koerant     |
| EV           | Elsie Vaalbooi[^1] |

[^1]: This data collection does not contain recordings of this speaker.

The STE- files do not have a speaker label, but all are recordings of
speaker KE.  For any other file whose name carries no speaker
abbreviation, the speaker is not explicitly marked.  Note that speaker
markers can occur in upper or lower case.  They are typically at the
end of the file name or separated by underscores from the rest of the
file name.
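
The conventions above can be turned into a rough heuristic for guessing the speaker from a file name. The following Python sketch is only an approximation: because the file naming is not consistent, it can miss speakers or report false matches (for example, when a word happens to end in two letters that match an abbreviation). The function name `guess_speaker` is illustrative, and the file names in the usage note are hypothetical.

```python
from pathlib import PurePath

# Speaker abbreviations as listed in the table above.
SPEAKERS = {
    "AK": "Antjie Kassie", "UR": "|Una Rooi", "KB": "Kheis Brou",
    "KE": "Katrina Esau", "AO": "Andries Oleyn", "GS": "Griet Seekoei",
    "FK": "Vytjie Koper", "SS": "Simon Sauls", "JK": "Johana Koper",
    "HK": "Hannie Koerant", "EV": "Elsie Vaalbooi",
}


def guess_speaker(file_name):
    """Return a speaker abbreviation for file_name, or None if none is found."""
    stem = PurePath(file_name).stem
    # STE- files carry no label but are all recordings of KE.
    if stem.upper().startswith("STE-"):
        return "KE"
    # A marker separated by underscores, in upper or lower case.
    for token in stem.split("_"):
        if token.upper() in SPEAKERS:
            return token.upper()
    # A marker at the very end of the file name (heuristic, may misfire).
    tail = stem[-2:].upper()
    if tail in SPEAKERS:
        return tail
    return None
```

With hypothetical names, `guess_speaker("word_ke_01.wav")` and `guess_speaker("STE-001.wav")` both give `"KE"`, while a name without any marker gives `None`.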

The data collection contains 20,382 audio files.  These files use the
following audio encodings:

| Audio encoding                                                                | Number of files | 
| :---------------------------------------------------------------------------- | --------------: | 
| 16-bit linear PCM, mono, 22050 Hz                                             |               2 |
| IFF data, AIFF audio                                                          |               1 |
| IFF data, AIFF-C compressed audio                                             |               3 |
| RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 16000 Hz   |              45 |
| RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 22050 Hz   |            9878 |
| RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 44100 Hz   |            8850 |
| RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 48000 Hz   |              27 |
| RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, stereo 44100 Hz |            1564 |
| RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, stereo 48000 Hz |              11 |
| RIFF (little-endian) data, WAVE audio, mono 22050 Hz                          |               1 |

### audio_files.txt

This file contains a list of files that can be found in the audio
directory.
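
The listing can be used to verify that a copy of the collection is complete. The sketch below assumes that audio_files.txt holds one file name or path per line, which this document does not specify, and compares files by base name only, so it also assumes that base names are unique. The function names are illustrative.

```python
from pathlib import Path


def read_listing(listing_path):
    """Return the set of base names listed in the file (one entry per line)."""
    lines = Path(listing_path).read_text(encoding="utf-8").splitlines()
    return {Path(line.strip()).name for line in lines if line.strip()}


def compare_listing(listing_path, audio_dir):
    """Return (missing, unlisted) base names, each as a sorted list."""
    listed = read_listing(listing_path)
    present = {p.name for p in Path(audio_dir).rglob("*") if p.is_file()}
    missing = sorted(listed - present)   # listed but not found on disk
    unlisted = sorted(present - listed)  # on disk but not in the listing
    return missing, unlisted
```

If both returned lists are empty, the audio directory matches the listing under the assumptions above.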


## Authors

Authors are ordered alphabetically by last name.

Christopher T Collins  
New York University  
10 Washington Place #411  
New York, NY 10003  
USA  
<cc116@nyu.edu>

Kerry Jones  
African Tongue  
Kommetjie, Cape Town, 7975  
South Africa  
<jonesleekerry@gmail.com>

Bonny Sands  
Northern Arizona University  
P.O. Box 6032  
Flagstaff, AZ 86011  
USA  
<sands@lowell.edu>

Alena Witzlack-Makarevich  
Department of Linguistics  
The Hebrew University of Jerusalem  
Mt. Scopus, Jerusalem 91905  
Israel  
<witzlack@gmail.com>


## License

This data collection is made available under the Creative Commons
Attribution-NonCommercial 3.0 South Africa license
([CC BY-NC 3.0 ZA](https://creativecommons.org/licenses/by-nc/3.0/za/)).
