Create a sub-filter


The export files from commercial citation database providers use a different format for each reference type. For example, a journal article has journal name, volume, and issue data that a book reference does not. Popular bibliographic record exchange formats such as Refer and BibTeX likewise use different tags for each reference type. That is why Biblioscape depends on reference type specific sub-filters to understand how each type of reference should be imported. For a Biblioscape import filter to work, you must create at least one sub-filter.

 

import_filter_manage

New

Under the sub-filters list in the right pane, click the "New" button to create a new reference type specific sub-filter. Biblioscape adds an item "Select from the list" to your sub-filter list. Click the text "Select from the list", and a drop-down arrow appears on the right. Click the arrow and select a reference type from the drop-down list. If this is the first sub-filter you have created, all its properties will be blank. Otherwise, the new sub-filter inherits all properties from the sub-filter that was selected when you clicked the "New" button. Because many properties are common to different sub-filters, this saves work: you only need to edit the parts that differ. Because of this inheritance, be careful which sub-filter is selected before you click "New". For example, if you want to create a sub-filter for the reference type "Book Edited" and you already have sub-filters for "Book" and "Journal Article", select "Book" before clicking "New", because "Book Edited" has more properties in common with "Book" than with "Journal Article". To start working on the new sub-filter, click the "Edit" button or double-click the sub-filter itself.

Edit

Clicking the "Edit" button opens the sub-filter edit window, where you can change the properties. Double-clicking a sub-filter has the same effect.

Delete

Click the "Delete" button to delete all the selected sub-filters. You will be prompted to confirm before anything is deleted. To select more than one sub-filter, hold down the Ctrl key while clicking each one.

Default reference type

An import filter is unlikely to cover every reference type that can appear in an import file. Some reference types appear only rarely, and some unexpected ones may not even exist in your database settings. You therefore need to select one reference type as the default to deal with those records. Biblioscape uses the default sub-filter to parse a record when no matching sub-filter can be found.

 

subFilter_match

Now we will cover the sub-filter edit window. This window is used to edit the parsing parameters for each reference type. The table on the left lists all available data fields for the current reference type, and you have to provide the tag for each data field you want to import. The first and most important property to set is the reference type matching text. Biblioscape uses the text you enter below the "Matching text" label to decide which sub-filter to use. On the "Match Fields" tab, enter a tag in the first row, "Reference Type". When the Biblioscape import engine sees this tag in the import file, it checks the text after the tag. If that text matches the text you entered in the "Matching text" edit box, this sub-filter is used. Each sub-filter is named after its reference type. In the example above, if Biblioscape finds the text "Journal Article" after the tag "%0 ", the sub-filter for Journal Article is used. If no reference type tag is entered, Biblioscape cannot tell which sub-filter to use, and the default sub-filter you set in the import filters window is used for all records instead.
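The selection rule described above can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not Biblioscape's own code: the function name, its parameters, and the choice of an exact comparison after trimming whitespace are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def choose_sub_filter(
    record_lines: List[str],
    type_tag: Optional[str],
    matching_texts: Dict[str, str],
    default: str,
) -> str:
    """Return the name of the sub-filter to use for one record.

    matching_texts maps a sub-filter name to its "Matching text".
    Hypothetical helper: it only models the rule described in the text.
    """
    if not type_tag:
        # No reference type tag entered: every record uses the default.
        return default
    for line in record_lines:
        if line.startswith(type_tag):
            text = line[len(type_tag):].strip()
            for name, matching in matching_texts.items():
                if text == matching:  # assumption: exact match
                    return name
    # Tag absent from the record, or no sub-filter matched its text.
    return default


record = ["%0 Journal Article", "%A Smith, J.", "%D 2002"]
filters = {"Journal Article": "Journal Article", "Book": "Book"}

print(choose_sub_filter(record, "%0 ", filters, default="Book"))
# Journal Article
print(choose_sub_filter(["%0 Thesis"], "%0 ", filters, default="Book"))
# Book
```

The second call shows the fallback: "Thesis" has no sub-filter, so the default one is chosen.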

Match Fields

There are two columns in the fields matching list. The first column lists the database fields for the reference type. The second column holds the field tag found in the import file. In the above example, the field tag "LA:" is mapped to the database field "Language". If Biblioscape finds the tag "LA:" in the import file, it puts all the text after the tag into the database Language field. For some files, you may want to put text from two tag fields into a single database field. In the above screenshot, if you want to put the keywords text after the tag "MJ:" as well as the text after the tag "MN:" into the database Keywords field, type the tag "MJ:" first, then click the "+" button after the label "Match field to more than one tag". Biblioscape inserts the text "^[+]^" after "MJ:". You can then enter "MN:". If you do not wish to use the "+" button, you can type the tag separator "^[+]^" directly.
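As a rough illustration of the mapping above, the following Python sketch splits each matching text on the "^[+]^" separator and routes tagged lines to database fields. The function names are hypothetical, and how Biblioscape joins the text from several tags inside one field is not stated here, so the sketch simply keeps the values in a list.

```python
from typing import Dict, List

TAG_SEPARATOR = "^[+]^"


def build_tag_map(match_fields: Dict[str, str]) -> Dict[str, str]:
    """Turn {database field: matching text} into {tag: database field}."""
    tag_map: Dict[str, str] = {}
    for field, matching_text in match_fields.items():
        for tag in matching_text.split(TAG_SEPARATOR):
            if tag:
                tag_map[tag] = field
    return tag_map


def import_record(lines: List[str], match_fields: Dict[str, str]) -> Dict[str, List[str]]:
    """Collect the text after each known tag under its database field."""
    tag_map = build_tag_map(match_fields)
    record: Dict[str, List[str]] = {}
    for line in lines:
        for tag, field in tag_map.items():
            if line.startswith(tag):
                record.setdefault(field, []).append(line[len(tag):].strip())
                break
    return record


match_fields = {"Language": "LA:", "Keywords": "MJ:^[+]^MN:"}
lines = ["LA: English", "MJ: Learning", "MN: Memory"]

print(import_record(lines, match_fields))
# {'Language': ['English'], 'Keywords': ['Learning', 'Memory']}
```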

 

Regular expression

If you know regular expressions, you can make tag matching considerably more powerful. Put your regular expression inside RE(...)RE. For example, when you want to map the tag "AU - " to the Authors field, "AU" may be followed by a varying number of spaces. In such a case you can use the regular expression RE(AU *- )RE. The asterisk tells Biblioscape that, no matter how many spaces (including none) are found between "AU" and "-", the text should be treated as a tag. To solve the problem in the previous section with a regular expression, type RE(MJ:|MN:)RE. The "|" sign tells Biblioscape that both "MJ:" and "MN:" should be treated as a tag.
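The two expressions above can be tried out in Python's `re` module, used here only as a stand-in: the regular expression dialect Biblioscape uses is not specified in this section and may differ in details. The helper `tag_pattern` is a hypothetical name; it treats text wrapped in RE(...)RE as a regular expression and anything else as literal text.

```python
import re


def tag_pattern(matching_text: str) -> re.Pattern:
    """Compile a matching text into a pattern for use with .match().

    Text wrapped in RE(...)RE is treated as a regular expression;
    anything else is matched literally.
    """
    if matching_text.startswith("RE(") and matching_text.endswith(")RE"):
        body = matching_text[3:-3]
    else:
        body = re.escape(matching_text)
    return re.compile("(?:" + body + ")")


authors = tag_pattern("RE(AU *- )RE")
keywords = tag_pattern("RE(MJ:|MN:)RE")

# Zero, one, or several spaces between "AU" and "-" all count as the tag.
for line in ["AU - Smith, J.", "AU  - Jones, K.", "AU- Lee, M."]:
    m = authors.match(line)
    print(line[m.end():])

# Both keyword tags are recognised; an unrelated tag is not.
print(bool(keywords.match("MJ: Learning")), bool(keywords.match("MN: Memory")))
print(keywords.match("LA: English"))
```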

 

Insert static text

subFilter_static

Some tag fields do not have corresponding database fields in Biblioscape. You may have to map them to one of the custom fields in the Biblioscape database, or to the Miscellaneous field, which can take almost unlimited amounts of data. In such cases, you may want to insert some text before or after the imported tag field data. For example, if the tag field is "Entry: 19991001" and you mapped it to the Custom 8 field, only the text "19991001" will be imported into the database field "Custom 8". The text "Entry: " will not be imported because it is the tag. If you want to add some text to remind you what the data is about, add "<[Date of entry: ]>" to the matching text, so that the matching text next to "Custom 8" reads "<[Date of entry: ]>Entry:". When Biblioscape sees the line "Entry: 19991001", it puts "19991001" into the Custom 8 database field and adds "Date of entry: " in front of it. The Custom 8 field data after the import becomes "Date of entry: 19991001". If you want to insert the static text after the imported text, use "Entry:[<: date of entry>]". The Custom 8 field data after the import then becomes "19991001: date of entry". The rule is: use <[...]> to make the text appear before the imported text, and use [<...>] to make it appear after the imported text.
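The before/after rule can be checked with a small Python model. It is a sketch under stated assumptions rather than Biblioscape's implementation: the function names are hypothetical, and the sketch assumes the imported data is trimmed of surrounding whitespace, as the "Entry: 19991001" example suggests.

```python
import re
from typing import Tuple

PREFIX = re.compile(r"<\[(.*?)\]>")  # <[...]>: text placed before the data
SUFFIX = re.compile(r"\[<(.*?)>\]")  # [<...>]: text placed after the data


def split_matching_text(matching_text: str) -> Tuple[str, str, str]:
    """Return (tag, text_before, text_after) for one matching text."""
    before = PREFIX.search(matching_text)
    after = SUFFIX.search(matching_text)
    tag = SUFFIX.sub("", PREFIX.sub("", matching_text))
    return (
        tag,
        before.group(1) if before else "",
        after.group(1) if after else "",
    )


def import_with_static_text(line: str, matching_text: str) -> str:
    """Strip the tag from the line and add the static text around the data."""
    tag, before, after = split_matching_text(matching_text)
    if not line.startswith(tag):
        raise ValueError("line does not start with the tag")
    data = line[len(tag):].strip()
    return before + data + after


print(import_with_static_text("Entry: 19991001", "<[Date of entry: ]>Entry:"))
# Date of entry: 19991001
print(import_with_static_text("Entry: 19991001", "Entry:[<: date of entry>]"))
# 19991001: date of entry
```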

Complex Fields

So far, we have covered simple, straightforward mapping of one tag field to one data field. Now, let's see how to map one tag field to several data fields. Many commercial citation database providers put journal name, year, volume, issue, and pages after a single tag. For example: "SO: Educational-Psychology. 2002 Mar; Vol 22(2): 219-233". In this case, enter the tag "SO: " as the matching text for the Journal, Year, Volume, Issue, Page Start, and Page End fields. Then click the "Complex Fields" tab and drag data fields from the "Available fields" list to the "Parse sequence" list. Arrange them in the same order as the data appears. Next, select the data fields in the "Parse sequence" list one by one and enter the identification text before and after each one, so that Biblioscape knows where each field starts and ends.

Let's use a simple example to explain this. In the screenshot, only two data fields are included. The tag field looks like "%P 234-237". For the "Start Page" field, there is no need to put anything in the box "ID text before selected field" because there is no text before the start page. Type "-" in the box "ID text after selected field". That tells Biblioscape where the start page ends. Then click "End Page" and put "-" in the box "ID text before selected field". This tells Biblioscape where the end page starts. Leave the box "ID text after selected field" blank, since there is no more data after the end page.

 

subFilter_complex
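The parse sequence idea can be sketched in Python as a left-to-right scan that uses each field's ID text before and after it. This is a simplified model with hypothetical names, not Biblioscape's parser. The ID texts for the "%P" example come from the description above; the ID texts chosen for the "SO: " example are illustrative guesses, since this page does not list them.

```python
from typing import Dict, List, Tuple

# (data field, ID text before, ID text after)
Step = Tuple[str, str, str]


def parse_complex_field(text: str, sequence: List[Step]) -> Dict[str, str]:
    """Split the text after a tag into several data fields."""
    result: Dict[str, str] = {}
    pos = 0
    for field, id_before, id_after in sequence:
        start = pos
        if id_before:
            found = text.find(id_before, pos)
            if found == -1:
                break  # identifier missing: stop rather than guess
            start = found + len(id_before)
        end = len(text)
        if id_after:
            found = text.find(id_after, start)
            if found == -1:
                break
            end = found
        result[field] = text[start:end].strip()
        # Resume at the closing identifier so the next field can reuse it
        # as its own "ID text before".
        pos = end
    return result


pages = [("Start Page", "", "-"), ("End Page", "-", "")]
print(parse_complex_field("234-237", pages))
# {'Start Page': '234', 'End Page': '237'}

# Assumed ID texts for the "SO: " example.
source = [
    ("Journal", "", ". "),
    ("Year", ". ", " "),
    ("Volume", "Vol ", "("),
    ("Issue", "(", ")"),
    ("Page Start", ": ", "-"),
    ("Page End", "-", ""),
]
print(parse_complex_field("Educational-Psychology. 2002 Mar; Vol 22(2): 219-233", source))
```

Note that the hyphen inside "Educational-Psychology" causes no trouble here, because each identifier is only searched for from the current position onward.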

Special ID text

Sometimes the identification text between data fields cannot be expressed by simple text matching. For example, if a month name consistently appears before the publication year, and there is no other good identifier to use between the publication year and the preceding data, you can select "Month" and click the arrow button. The text "^[month]^" will be inserted into the ID text field.

 

Number (93): "^[number]^" will be used to match any integer.
OR: "^[OR]^" will be inserted into the ID text field. Put it between two matching texts. For example, book editors are followed by "ed. " if there is only one editor, and by "eds. " if there are several. You can put an OR between the two texts so that the ID text reads: "ed. ^[OR]^eds. ".
Month (February): "^[month]^" will be used to match any month name in the text, for example January, February, and so on.
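One way to picture the three special ID texts is as shorthand for regular expressions. The Python sketch below translates an ID text into an equivalent pattern. It is an interpretation, not Biblioscape's implementation: the function name is hypothetical, and the sketch assumes "^[month]^" matches full English month names only, which this page does not confirm.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")


def id_text_to_regex(id_text: str) -> str:
    """Translate an ID text containing special tokens into a regex string."""
    alternatives = []
    for part in id_text.split("^[OR]^"):
        # Keep the special tokens as separate pieces; escape everything else.
        pieces = re.split(r"(\^\[number\]\^|\^\[month\]\^)", part)
        out = ""
        for piece in pieces:
            if piece == "^[number]^":
                out += r"\d+"
            elif piece == "^[month]^":
                out += "(?:" + MONTHS + ")"
            else:
                out += re.escape(piece)
        alternatives.append(out)
    return "(?:" + "|".join(alternatives) + ")"


editors = re.compile(id_text_to_regex("ed. ^[OR]^eds. "))
print(bool(editors.fullmatch("ed. ")), bool(editors.fullmatch("eds. ")))
# True True

# A month name followed by a space marks where the year starts.
text = "Nature, February 2002"
m = re.search(id_text_to_regex("^[month]^ "), text)
print(text[m.end():])
# 2002
```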

Exclude Fields

Citation database providers may include tag fields that you don't need. Even if you don't want to import those tag fields, you still need to handle them in the import filter. Enter those tags in the "Exclude Fields" list; when Biblioscape sees these tags, it removes the corresponding tag fields. If a tag is neither mapped nor included in the "Exclude Fields" list, Biblioscape treats its line as part of the previous tag field.

subFilter_exclude
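The effect of the "Exclude Fields" list can be illustrated with a small Python model of a record reader. The names and the sample tags "TI: ", "AN: ", and "AB: " are made up for this sketch, and joining continuation text with a single space is an assumption.

```python
from typing import Dict, List, Optional


def read_record(lines: List[str], tag_map: Dict[str, str], exclude: List[str]) -> Dict[str, str]:
    """Read one record, dropping excluded tag fields."""
    record: Dict[str, str] = {}
    current: Optional[str] = None  # field that receives continuation lines
    for line in lines:
        tag = next((t for t in tag_map if line.startswith(t)), None)
        if tag is not None:
            current = tag_map[tag]
            record[current] = line[len(tag):].strip()
        elif any(line.startswith(t) for t in exclude):
            current = None  # excluded: drop this line and its continuations
        elif current is not None:
            # Unknown line: treated as part of the previous tag field.
            record[current] += " " + line.strip()
    return record


tag_map = {"TI: ": "Title", "AB: ": "Abstract"}
lines = ["TI: Memory and learning", "AN: 2002-12345", "AB: A short abstract."]

print(read_record(lines, tag_map, exclude=["AN: "]))
# {'Title': 'Memory and learning', 'Abstract': 'A short abstract.'}
print(read_record(lines, tag_map, exclude=[]))
# {'Title': 'Memory and learning AN: 2002-12345', 'Abstract': 'A short abstract.'}
```

The second call shows why unwanted tags still have to be listed: without the exclusion, the "AN: " line leaks into the Title field.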

 

Regular expression

If you know regular expressions, there is no need to use special ID text, because regular expressions are far more powerful and can express patterns that simple text matching cannot. When using a regular expression, put it inside "RE(...)RE". You can use regular expressions to match field tags, to exclude field tags, or to match ID texts in complex fields. For example, if the "Authors" tag differs between two data providers, one using "Authors: " and the other "Author(s) : ", you can use the following regular expression as the matching text: RE(Author.*: )RE. Inside the RE markers, ".*" matches any run of zero or more characters.
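The "Authors" pattern above can be tested in Python's `re` module, used here as a stand-in because the regular expression engine behind Biblioscape is not named on this page. In Python, ".*" is greedy, so it extends to the last ": " on the line; whether Biblioscape behaves the same way is an assumption worth checking on real data. The tighter pattern shown second is one possible alternative, not something taken from the manual.

```python
import re

greedy = re.compile(r"Author.*: ")
tight = re.compile(r"Author(?:s|\(s\))? ?: ")


def strip_tag(pattern: re.Pattern, line: str) -> str:
    """Return the text after the tag, or the whole line if no tag matches."""
    m = pattern.match(line)
    return line[m.end():] if m else line


for line in ["Authors: Smith, J.", "Author(s) : Smith, J."]:
    print(strip_tag(greedy, line), "|", strip_tag(tight, line))
# Smith, J. | Smith, J.
# Smith, J. | Smith, J.

# When the data itself contains ": ", the greedy pattern swallows part of it.
tricky = "Authors: Smith, J. Note: corresponding author"
print(strip_tag(greedy, tricky))
# corresponding author
print(strip_tag(tight, tricky))
# Smith, J. Note: corresponding author
```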