TEI exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Cultural impacts on electronic publishing: experience in Serbia

Internal identifier: 000226 (Istex/Corpus); previous: 000225; next: 000227


Authors: Duško Vitas; Cvetana Krstev

Source :

RBID : ISTEX:06F30003BD4B2A843A15B75F4C9D4525B49229BE

Abstract

Discusses the linguistic influences on an electronic publishing infrastructure in an environment with unstable linguistic standardization from the computational point of view. Essentially, in Serbia in the last half of the century (at least) publishing is based on the following facts: two alphabetic systems are regularly in use with the possibility to mix both alphabets in the same document; the various dialects are accepted as a part of a linguistic norm; orthography is unstable ‐ presently, several linguistic attitudes that have different views of the orthographic norm are under discussion; and, in Serbia, many minority languages are in use, which makes it difficult to provide efficient contact between different communities through electronic publishing. In this context, a systematic solution that responds to this complex situation has not been developed in the frame of traditional Serbian linguistics and lexicography in a way that enables the adequate incorporation of the new publishing technologies. Owing to these constraints, the direct application of electronic publishing tools frequently causes the degradation of the linguistic message. In such an environment, the promotion of electronic publishing therefore needs specific solutions. The paper discusses the general frame based on the specifically encoded system of electronic dictionaries that makes electronic texts independent of some of the mentioned constraints. The objective of such a frame is to enable the linguistic normalization of texts at the level of their internal representation, and to establish bridges for communicating with other language societies. Some aspects of electronic text representation that ensures its correct interpretation in different graphical systems and in different dialects are described. This also allows text indexing and retrieval using the same techniques that are available for languages not burdened with these problems.

Url:
DOI: 10.1108/03074809910273278

Links to Exploration step

ISTEX:06F30003BD4B2A843A15B75F4C9D4525B49229BE

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Cultural impacts on electronic publishing experience in Serbia</title>
<author>
<name sortKey="Ko Vitas, Du" sort="Ko Vitas, Du" uniqKey="Ko Vitas D" first="Du" last="Ko Vitas">Du Ko Vitas</name>
<affiliation>
<mods:affiliation>Assistant Professor at the Computer Science Department of the Faculty of Mathematics, University of Belgrade, Yugoslavia</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Krstev, Cvetana" sort="Krstev, Cvetana" uniqKey="Krstev C" first="Cvetana" last="Krstev">Cvetana Krstev</name>
<affiliation>
<mods:affiliation>Assistant Professor at the Library and Information Science Department of the Philological Faculty, both at the University of Belgrade, Yugoslavia</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:06F30003BD4B2A843A15B75F4C9D4525B49229BE</idno>
<date when="1999" year="1999">1999</date>
<idno type="doi">10.1108/03074809910273278</idno>
<idno type="url">https://api.istex.fr/document/06F30003BD4B2A843A15B75F4C9D4525B49229BE/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000226</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Cultural impacts on electronic publishing experience in Serbia</title>
<author>
<name sortKey="Ko Vitas, Du" sort="Ko Vitas, Du" uniqKey="Ko Vitas D" first="Du" last="Ko Vitas">Du Ko Vitas</name>
<affiliation>
<mods:affiliation>Assistant Professor at the Computer Science Department of the Faculty of Mathematics, University of Belgrade, Yugoslavia</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Krstev, Cvetana" sort="Krstev, Cvetana" uniqKey="Krstev C" first="Cvetana" last="Krstev">Cvetana Krstev</name>
<affiliation>
<mods:affiliation>Assistant Professor at the Library and Information Science Department of the Philological Faculty, both at the University of Belgrade, Yugoslavia</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">New Library World</title>
<idno type="ISSN">0307-4803</idno>
<imprint>
<publisher>MCB UP Ltd</publisher>
<date type="published" when="1999-07-01">1999-07-01</date>
<biblScope unit="volume">100</biblScope>
<biblScope unit="issue">4</biblScope>
<biblScope unit="page" from="171">171</biblScope>
<biblScope unit="page" to="179">179</biblScope>
</imprint>
<idno type="ISSN">0307-4803</idno>
</series>
<idno type="istex">06F30003BD4B2A843A15B75F4C9D4525B49229BE</idno>
<idno type="DOI">10.1108/03074809910273278</idno>
<idno type="filenameID">0721000404</idno>
<idno type="original-pdf">0721000404.pdf</idno>
<idno type="href">03074809910273278.pdf</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0307-4803</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Discusses the linguistic influences on an electronic publishing infrastructure in an environment with unstable linguistic standardization from the computational point of view. Essentially, in Serbia in the last half of the century (at least) publishing is based on the following facts: two alphabetic systems are regularly in use with the possibility to mix both alphabets in the same document; the various dialects are accepted as a part of a linguistic norm; orthography is unstable ‐ presently, several linguistic attitudes that have different views of the orthographic norm are under discussion; and, in Serbia, many minority languages are in use, which makes it difficult to provide efficient contact between different communities through electronic publishing. In this context, a systematic solution that responds to this complex situation has not been developed in the frame of traditional Serbian linguistics and lexicography in a way that enables the adequate incorporation of the new publishing technologies. Owing to these constraints, the direct application of electronic publishing tools frequently causes the degradation of the linguistic message. In such an environment, the promotion of electronic publishing therefore needs specific solutions. The paper discusses the general frame based on the specifically encoded system of electronic dictionaries that makes electronic texts independent of some of the mentioned constraints. The objective of such a frame is to enable the linguistic normalization of texts at the level of their internal representation, and to establish bridges for communicating with other language societies. Some aspects of electronic text representation that ensures its correct interpretation in different graphical systems and in different dialects are described. This also allows text indexing and retrieval using the same techniques that are available for languages not burdened with these problems.</div>
</front>
</TEI>
<istex>
<corpusName>emerald</corpusName>
<author>
<json:item>
<name>Duško Vitas</name>
<affiliations>
<json:string>Assistant Professor at the Computer Science Department of the Faculty of Mathematics, University of Belgrade, Yugoslavia</json:string>
</affiliations>
</json:item>
<json:item>
<name>Cvetana Krstev</name>
<affiliations>
<json:string>Assistant Professor at the Library and Information Science Department of the Philological Faculty, both at the University of Belgrade, Yugoslavia</json:string>
</affiliations>
</json:item>
</author>
<subject>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Electronic publishing</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Language</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>National cultures</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Serbia</value>
</json:item>
</subject>
<language>
<json:string>eng</json:string>
</language>
<originalGenre>
<json:string>research-article</json:string>
</originalGenre>
<abstract>Discusses the linguistic influences on an electronic publishing infrastructure in an environment with unstable linguistic standardization from the computational point of view. Essentially, in Serbia in the last half of the century (at least) publishing is based on the following facts: two alphabetic systems are regularly in use with the possibility to mix both alphabets in the same document; the various dialects are accepted as a part of a linguistic norm; orthography is unstable ‐ presently, several linguistic attitudes that have different views of the orthographic norm are under discussion; and, in Serbia, many minority languages are in use, which makes it difficult to provide efficient contact between different communities through electronic publishing. In this context, a systematic solution that responds to this complex situation has not been developed in the frame of traditional Serbian linguistics and lexicography in a way that enables the adequate incorporation of the new publishing technologies. Owing to these constraints, the direct application of electronic publishing tools frequently causes the degradation of the linguistic message. In such an environment, the promotion of electronic publishing therefore needs specific solutions. The paper discusses the general frame based on the specifically encoded system of electronic dictionaries that makes electronic texts independent of some of the mentioned constraints. The objective of such a frame is to enable the linguistic normalization of texts at the level of their internal representation, and to establish bridges for communicating with other language societies. Some aspects of electronic text representation that ensures its correct interpretation in different graphical systems and in different dialects are described. This also allows text indexing and retrieval using the same techniques that are available for languages not burdened with these problems.</abstract>
<qualityIndicators>
<score>7.28</score>
<pdfVersion>1.2</pdfVersion>
<pdfPageSize>595 x 842 pts (A4)</pdfPageSize>
<refBibsNative>true</refBibsNative>
<keywordCount>4</keywordCount>
<abstractCharCount>1935</abstractCharCount>
<pdfWordCount>4280</pdfWordCount>
<pdfCharCount>30111</pdfCharCount>
<pdfPageCount>8</pdfPageCount>
<abstractWordCount>281</abstractWordCount>
</qualityIndicators>
<title>Cultural impacts on electronic publishing experience in Serbia</title>
<genre>
<json:string>research-article</json:string>
</genre>
<host>
<volume>100</volume>
<publisherId>
<json:string>nlw</json:string>
</publisherId>
<pages>
<last>179</last>
<first>171</first>
</pages>
<issn>
<json:string>0307-4803</json:string>
</issn>
<issue>4</issue>
<subject>
<json:item>
<value>Library &amp; information science</value>
</json:item>
<json:item>
<value>Librarianship/library management</value>
</json:item>
<json:item>
<value>Library &amp; information services</value>
</json:item>
</subject>
<genre>
<json:string>journal</json:string>
</genre>
<language>
<json:string>unknown</json:string>
</language>
<title>New Library World</title>
<doi>
<json:string>10.1108/nlw</json:string>
</doi>
</host>
<publicationDate>1999</publicationDate>
<copyrightDate>1999</copyrightDate>
<doi>
<json:string>10.1108/03074809910273278</json:string>
</doi>
<id>06F30003BD4B2A843A15B75F4C9D4525B49229BE</id>
<score>0.20896956</score>
<fulltext>
<json:item>
<original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/06F30003BD4B2A843A15B75F4C9D4525B49229BE/fulltext/pdf</uri>
</json:item>
<json:item>
<original>false</original>
<mimetype>application/zip</mimetype>
<extension>zip</extension>
<uri>https://api.istex.fr/document/06F30003BD4B2A843A15B75F4C9D4525B49229BE/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/06F30003BD4B2A843A15B75F4C9D4525B49229BE/fulltext/tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main" xml:lang="en">Cultural impacts on electronic publishing experience in Serbia</title>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher>MCB UP Ltd</publisher>
<availability>
<p>EMERALD</p>
</availability>
<date>1999</date>
</publicationStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a" type="main" xml:lang="en">Cultural impacts on electronic publishing experience in Serbia</title>
<author>
<persName>
<forename type="first">Duško</forename>
<surname>Vitas</surname>
</persName>
<affiliation>Assistant Professor at the Computer Science Department of the Faculty of Mathematics, University of Belgrade, Yugoslavia</affiliation>
</author>
<author>
<persName>
<forename type="first">Cvetana</forename>
<surname>Krstev</surname>
</persName>
<affiliation>Assistant Professor at the Library and Information Science Department of the Philological Faculty, both at the University of Belgrade, Yugoslavia</affiliation>
</author>
</analytic>
<monogr>
<title level="j">New Library World</title>
<idno type="pISSN">0307-4803</idno>
<idno type="DOI">10.1108/nlw</idno>
<imprint>
<publisher>MCB UP Ltd</publisher>
<date type="published" when="1999-07-01"></date>
<biblScope unit="volume">100</biblScope>
<biblScope unit="issue">4</biblScope>
<biblScope unit="page" from="171">171</biblScope>
<biblScope unit="page" to="179">179</biblScope>
</imprint>
</monogr>
<idno type="istex">06F30003BD4B2A843A15B75F4C9D4525B49229BE</idno>
<idno type="DOI">10.1108/03074809910273278</idno>
<idno type="filenameID">0721000404</idno>
<idno type="original-pdf">0721000404.pdf</idno>
<idno type="href">03074809910273278.pdf</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>1999</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract xml:lang="en">
<p>Discusses the linguistic influences on an electronic publishing infrastructure in an environment with unstable linguistic standardization from the computational point of view. Essentially, in Serbia in the last half of the century (at least) publishing is based on the following facts: two alphabetic systems are regularly in use with the possibility to mix both alphabets in the same document; the various dialects are accepted as a part of a linguistic norm; orthography is unstable ‐ presently, several linguistic attitudes that have different views of the orthographic norm are under discussion; and, in Serbia, many minority languages are in use, which makes it difficult to provide efficient contact between different communities through electronic publishing. In this context, a systematic solution that responds to this complex situation has not been developed in the frame of traditional Serbian linguistics and lexicography in a way that enables the adequate incorporation of the new publishing technologies. Owing to these constraints, the direct application of electronic publishing tools frequently causes the degradation of the linguistic message. In such an environment, the promotion of electronic publishing therefore needs specific solutions. The paper discusses the general frame based on the specifically encoded system of electronic dictionaries that makes electronic texts independent of some of the mentioned constraints. The objective of such a frame is to enable the linguistic normalization of texts at the level of their internal representation, and to establish bridges for communicating with other language societies. Some aspects of electronic text representation that ensures its correct interpretation in different graphical systems and in different dialects are described. This also allows text indexing and retrieval using the same techniques that are available for languages not burdened with these problems.</p>
</abstract>
<textClass>
<keywords scheme="keyword">
<list>
<head>Keywords</head>
<item>
<term>Electronic publishing</term>
</item>
<item>
<term>Language</term>
</item>
<item>
<term>National cultures</term>
</item>
<item>
<term>Serbia</term>
</item>
</list>
</keywords>
</textClass>
<textClass>
<keywords scheme="Emerald Subject Group">
<list>
<label>cat-LISC</label>
<item>
<term>Library &amp; information science</term>
</item>
<label>cat-LLM</label>
<item>
<term>Librarianship/library management</term>
</item>
<label>cat-LISE</label>
<item>
<term>Library &amp; information services</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="1999-07-01">Published</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<original>false</original>
<mimetype>text/plain</mimetype>
<extension>txt</extension>
<uri>https://api.istex.fr/document/06F30003BD4B2A843A15B75F4C9D4525B49229BE/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="corpus emerald not found" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:document><!-- Auto generated NISO JATS XML created by Atypon out of MCB DTD source files. Do Not Edit! -->
<article dtd-version="1.0" xml:lang="en" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">nlw</journal-id>
<journal-id journal-id-type="doi">10.1108/nlw</journal-id>
<journal-title-group>
<journal-title>New Library World</journal-title>
</journal-title-group>
<issn pub-type="ppub">0307-4803</issn>
<publisher>
<publisher-name>MCB UP Ltd</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.1108/03074809910273278</article-id>
<article-id pub-id-type="original-pdf">0721000404.pdf</article-id>
<article-id pub-id-type="filename">0721000404</article-id>
<article-categories>
<subj-group subj-group-type="type-of-publication">
<compound-subject>
<compound-subject-part content-type="code">research-article</compound-subject-part>
<compound-subject-part content-type="label">Research paper</compound-subject-part>
</compound-subject>
</subj-group>
<subj-group subj-group-type="subject">
<compound-subject>
<compound-subject-part content-type="code">cat-LISC</compound-subject-part>
<compound-subject-part content-type="label">Library &amp; information science</compound-subject-part>
</compound-subject>
<subj-group>
<compound-subject>
<compound-subject-part content-type="code">cat-LLM</compound-subject-part>
<compound-subject-part content-type="label">Librarianship/library management</compound-subject-part>
</compound-subject>
</subj-group>
<subj-group>
<compound-subject>
<compound-subject-part content-type="code">cat-LISE</compound-subject-part>
<compound-subject-part content-type="label">Library &amp; information services</compound-subject-part>
</compound-subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Cultural impacts on electronic publishing: experience in Serbia</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<string-name>
<given-names>Duško</given-names>
<surname>Vitas</surname>
</string-name>
<aff>Assistant Professor at the Computer Science Department of the Faculty of Mathematics, University of Belgrade, Yugoslavia</aff>
</contrib>
<x></x>
<contrib contrib-type="author">
<string-name>
<given-names>Cvetana</given-names>
<surname>Krstev</surname>
</string-name>
<aff>Assistant Professor at the Library and Information Science Department of the Philological Faculty, both at the University of Belgrade, Yugoslavia</aff>
</contrib>
</contrib-group>
<pub-date pub-type="ppub">
<day>01</day>
<month>07</month>
<year>1999</year>
</pub-date>
<volume>100</volume>
<issue>4</issue>
<fpage>171</fpage>
<lpage>179</lpage>
<permissions>
<copyright-statement>© MCB UP Limited</copyright-statement>
<copyright-year>1999</copyright-year>
<license license-type="publisher">
<license-p></license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="03074809910273278.pdf"></self-uri>
<abstract>
<p>Discusses the linguistic influences on an electronic publishing infrastructure in an environment with unstable linguistic standardization from the computational point of view. Essentially, in Serbia in the last half of the century (at least) publishing is based on the following facts: two alphabetic systems are regularly in use with the possibility to mix both alphabets in the same document; the various dialects are accepted as a part of a linguistic norm; orthography is unstable ‐ presently, several linguistic attitudes that have different views of the orthographic norm are under discussion; and, in Serbia, many minority languages are in use, which makes it difficult to provide efficient contact between different communities through electronic publishing. In this context, a systematic solution that responds to this complex situation has not been developed in the frame of traditional Serbian linguistics and lexicography in a way that enables the adequate incorporation of the new publishing technologies. Owing to these constraints, the direct application of electronic publishing tools frequently causes the degradation of the linguistic message. In such an environment, the promotion of electronic publishing therefore needs specific solutions. The paper discusses the general frame based on the specifically encoded system of electronic dictionaries that makes electronic texts independent of some of the mentioned constraints. The objective of such a frame is to enable the linguistic normalization of texts at the level of their internal representation, and to establish bridges for communicating with other language societies. Some aspects of electronic text representation that ensures its correct interpretation in different graphical systems and in different dialects are described. This also allows text indexing and retrieval using the same techniques that are available for languages not burdened with these problems.</p>
</abstract>
<kwd-group>
<kwd>Electronic publishing</kwd>
<x>, </x>
<kwd>Language</kwd>
<x>, </x>
<kwd>National cultures</kwd>
<x>, </x>
<kwd>Serbia</kwd>
</kwd-group>
<custom-meta-group>
<custom-meta>
<meta-name>peer-reviewed</meta-name>
<meta-value>no</meta-value>
</custom-meta>
<custom-meta>
<meta-name>academic-content</meta-name>
<meta-value>yes</meta-value>
</custom-meta>
<custom-meta>
<meta-name>rightslink</meta-name>
<meta-value>included</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec>
<title>Introduction</title>
<p>For the last two decades in Serbia, as well as in former Yugoslavia, all the devices necessary for electronic publishing have been present on the technological level. This technology was imported to fulfil the real needs of publishing houses, but the equipment itself proved insufficient to develop the wider environment in which it can produce its best effects. Frequently encountered examples, such as the retyping of the same text several times during its production, the destruction of electronic texts at publishers' sites, the lack of a national standards group corresponding to ISO/IEC JTC1 SC18, and many other factors, show that importing technology is not sufficient to prevent technological underdevelopment. Because of these deficiencies, a paradoxical situation arises: although the technological base for electronic publishing is well developed, it is often not used to make the flow of information more efficient.</p>
<p>Although the recently adopted strategy for the development of information technology in Yugoslavia (July 1997) shows that the importance of electronic publishing is recognized, the projects it proposes unfortunately continue the former practice. For instance, the document recommends that the products of electronic publishing be put on the Internet, which conforms to global trends, but says nothing about the infrastructural prerequisites necessary to achieve this goal.</p>
<p>Inspired by these problems, the research group for text processing at the Faculty of Mathematics investigated tools that would, at least at the linguistic level, enable efficient information processing and communication.</p>
</sec>
<sec>
<title>Text as a natural language object</title>
<p>At least one part of most documents consists of text in some natural language. This part of a document, whether written by hand or in electronic form, is a representation of information rather than the information itself (Birnbaum, 1995). Text in electronic form is represented by a sequence of bytes that can be interpreted in some way. During the last decade, great effort has been made to describe formally the structure of text, namely its logical and graphical layout, as well as to develop comprehensive character codes. The description of these formal aspects of text structure helps a human reader to understand it better, although on the level of its internal representation the text itself does not contain the information that enables this understanding. (Looking at the visual representation of a text, one has the impression that the text really contains the information that can be read from it.) The understanding of text stems from understanding the language in which it is written (Schwartze, 1985). The portion of a document that comprises natural language text is organized primarily by natural language and is interpreted through its linguistic features, not through the graphical or logical layout or some other non‐linguistic characteristic of the text.</p>
<p>Even on the level of international standards in the field of information technology (e.g. ISO/IEC, group JTC1/SC18), electronic text is not seen as an object organized by the rules of some natural language. The lack of this kind of linguistic information can, on the one hand, lead to the degradation and corruption of text to the point where the encoded information it has to convey cannot be reconstructed and, on the other hand, prevents any automatic transformation of text based on this linguistic knowledge.</p>
<p>In the case of languages for which linguistic standardization has been achieved, these facts can be hidden to some extent. However, in the case of a language system such as Serbo‐Croatian, the lack of this information can cause serious problems at every step of processing.</p>
</sec>
<sec>
<title>Serbo‐Croatian</title>
<p>We use the term Serbo‐Croatian to cover a linguistic system in the sense described in Popović (1996): Serbo‐Croatian is used as an accepted name for one linguistic base from which several different language standards were derived: Serbian, Croatian, Bosnian. In this article we confine ourselves to the use of the language in the territory of Serbia, as this is primarily relevant to our work.</p>
<p>The source of this diffuse situation can be found in an orthographic reform dating from the middle of the nineteenth century that introduced a phonetically based orthography. However, cultural and historical conditions did not allow this reform to be supported by appropriate language standardization. The consequences were twofold: on the cultural level, the reform produced a rough separation from the former cultural heritage (Selimović, 1987) and, on the linguistic level, it enabled the reproduction of many pronunciations in a written message. The latter phenomenon created a situation in which variations in contemporary text resemble the problems encountered today by researchers of old Italian, old French, etc. The same phenomena are present in other contemporary languages with stable standardization, but in these cases they are a result of occasional graphical variations (Gross, 1989a) rather than a systemic feature of the orthographic system.</p>
<p>As a result of close connections with different cultures in recent history, two alphabets are in use in Serbia: Latin and Cyrillic. Although Cyrillic is recommended as the official alphabet, in a large number of documents the Latin alphabet is used for various reasons, sometimes political but also practical (such as a lack of appropriate Cyrillic fonts etc.). Consequently, recent attempts, for instance in the frame of ISO TC46, to define only the part of Serbo‐Croatian that uses the Cyrillic alphabet as Serbian were not justifiable. The corpus of daily newspapers published in Serbia shows that in some of them the Cyrillic alphabet prevails while in the others the Latin alphabet prevails. An illustrative example is the regular bulletin of the Yugoslav Standards Organization, in which the Latin and Cyrillic alphabets are mixed even in its title, which begins with the Latin <italic>JUS</italic> and continues in Cyrillic. This means that, for the purpose of automatic processing, the dialects and subdialects used in a particular text, as well as the alphabet in which they are written, are essential attributes of the text that should be explicitly encoded.</p>
<p>These two phenomena, the reproduction of pronunciation in written text and the use of two alphabets, have to be taken into consideration before a text is processed. The lack of appropriate support for these requirements diminishes the possibilities of electronic publishing. For instance, some newspapers and journals published in Serbia often use a reduced Latin alphabet for their Internet presentations. It consists of 22 letters instead of 30: that is, the English alphabet without q, w, x, y. Diacritics are omitted and some graphemes are substituted by digraphs[1]. Diacritics are, however, distinctive in Serbian, as the following word forms show:</p>
<list list-type="order">
<list-item>
<label>1. </label>
<p>
<italic>reci</italic>
, dative singular of
<italic>reka</italic>
(Eng. river) or imperative second person singular of
<italic>reći</italic>
(Eng. to say);</p>
</list-item>
<list-item>
<label>2. </label>
<p>
<italic>reći</italic>
, infinitive (Eng. to say);</p>
</list-item>
<list-item>
<label>3. </label>
<p>
<italic>reči</italic>
, nominative plural of
<italic>reč</italic>
(Eng. word).</p>
</list-item>
</list>
<p>All of these forms are reduced to only one,
<italic>reci</italic>
, if this reduced Latin alphabet is used.</p>
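<p>The collapse of the three word forms can be reproduced with a minimal sketch. The function name degrade and the replacement of đ by the digraph dj are assumptions made for illustration only; the sites mentioned may have used other substitutions.</p>

```python
import unicodedata

# "đ" has no Unicode decomposition, so it is mapped by hand.
# The digraph "dj" is an assumed substitution, used here for illustration.
SPECIAL = {"đ": "dj", "Đ": "Dj"}


def degrade(word: str) -> str:
    """Strip diacritics, as a 'reduced Latin' presentation would."""
    out = []
    for ch in word:
        if ch in SPECIAL:
            out.append(SPECIAL[ch])
            continue
        # NFD splits e.g. "č" into "c" plus a combining mark (category Mn).
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed
                           if unicodedata.category(c) != "Mn"))
    return "".join(out)


forms = ["reci", "reći", "reči"]
print({degrade(form) for form in forms})  # a single form survives
```

<p>Because the mapping is many-to-one, the original form cannot be restored from the degraded text without lexical knowledge.</p>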
<p>The problems occurring at this lowest level of text representation are nevertheless too complex to be solved by simple string‐matching methods: even the transliteration between the Cyrillic and Latin alphabets is not unique, and pronunciation variants are standardized neither at the orthographic nor at the morphological level. For instance, if a text is transformed from the Latin to the Cyrillic alphabet only by changing the font, while all the character codes are preserved,
<italic>saopštenje</italic>
(Eng. communication) becomes
<italic>саопштенје</italic>
instead of
<italic>саопштење</italic>
, and
<italic>саопштење</italic>
becomes
<italic>saopštewe</italic>
instead of
<italic>saopštenje</italic>
if the transformation is done the other way round. Both
<italic>саопштенје</italic>
and
<italic>saopštewe</italic>
are misspelled in Serbo‐Croatian. Moreover, digraphs can be ambiguous, as is shown by the noun
<italic>konjugacija</italic>
(Eng. conjugation), where the group
<italic>nj</italic>
remains two separate letters in Cyrillic:
<italic>конјугација</italic>
.</p>
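<p>The two failure modes described above can be sketched as follows. The mapping tables cover only the letters needed for the examples, and the names letter_by_letter, digraph_aware and keep_split are hypothetical; the exception set stands in for the lexical knowledge that a dictionary would have to supply.</p>

```python
# Letter-for-letter correspondences, limited to the letters used below.
LAT2CYR = {
    "a": "а", "c": "ц", "e": "е", "g": "г", "i": "и", "j": "ј", "k": "к",
    "n": "н", "o": "о", "p": "п", "s": "с", "š": "ш", "t": "т", "u": "у",
}
# Latin digraphs that normally stand for a single Cyrillic letter.
DIGRAPHS = {"lj": "љ", "nj": "њ", "dž": "џ"}


def letter_by_letter(word):
    """Mimic a font swap: every character is mapped on its own."""
    return "".join(LAT2CYR[ch] for ch in word)


def digraph_aware(word, keep_split=frozenset()):
    """Map digraphs to one letter, except in words listed in keep_split."""
    if word in keep_split:
        return letter_by_letter(word)
    out, i = [], 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            out.append(LAT2CYR[word[i]])
            i += 1
    return "".join(out)


print(letter_by_letter("saopštenje"))                 # misspelled: н + ј
print(digraph_aware("saopštenje"))                    # correct: њ
print(digraph_aware("konjugacija"))                   # misspelled: њ
print(digraph_aware("konjugacija", {"konjugacija"}))  # correct: н + ј
```

<p>Neither rule is right for both words, which is why the choice has to be recorded per lexical entry rather than decided by string matching.</p>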
<p>On the other hand, the efforts of traditional lexicography are mainly concentrated on the production of a Serbo‐Croatian dictionary of the literary language and vernacular, a project of the Serbian Academy of Sciences and Arts that is due to be finished and published in paper form by the year 2050. Solutions to the linguistic problems that arise in electronic publishing are therefore often sought in
<italic>ad hoc</italic>
orthographic guidelines. As a consequence of the separation of the Serbian and Croatian languages, the key interest of most orthographers in Serbia during the last few years was, and still is, to stress the differences between these two languages. Despite the many differences among them, most orthographers agree on two points:</p>
<list list-type="order">
<list-item>
<label>(1) </label>
<p>The Serbian language consists of two pronunciations:
<italic>ekavian</italic>
and
<italic>jekavian</italic>
. (
<italic>Jekavian</italic>
pronunciation is also the base for Croatian). Both pronunciations are reproduced in written text. For instance, both the
<italic>ekavian</italic>
form
<italic>reč</italic>
and
<italic>jekavian</italic>
form
<italic>riječ</italic>
(Eng. word) occur in written form.</p>
</list-item>
<list-item>
<label>(2) </label>
<p>The Serbian language uses both the Cyrillic and Latin alphabet.</p>
</list-item>
</list>
<p>However, there is not full agreement about other language phenomena. As a result of this unstable situation, one can find, even in the documents of the recently established official Council for Language, the following records:</p>
<p>
<italic>preds(j)ednik</italic>
(Eng. president), encompassing both
<italic>predsednik</italic>
(
<italic>ek.</italic>
) and
<italic>predsjednik</italic>
(
<italic>jek.</italic>
)</p>
<p>
<italic>r(ij)eč</italic>
(Eng. word), encompassing both
<italic>reč</italic>
(
<italic>ek.</italic>
) and
<italic>riječ</italic>
(
<italic>jek.</italic>
)</p>
<p>From a formal point of view, this solution introduces parentheses into the alphabet and substantially complicates the recognition of formal words as sequences of characters between separators.</p>
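<p>A short sketch shows why the parenthesized records are troublesome. It assumes that a formal word is a maximal run of word characters and that a parenthesized segment marks optional letters; the helper names tokenize and expand are hypothetical.</p>

```python
import re


def tokenize(text):
    """Formal words: maximal runs of word characters between separators."""
    return re.findall(r"\w+", text)


def expand(record):
    """Expand optional parenthesized segments into all full variants."""
    match = re.search(r"\(([^()]*)\)", record)
    if match is None:
        return [record]
    head, tail = record[:match.start()], record[match.end():]
    variants = []
    for optional in ("", match.group(1)):
        variants.extend(expand(head + optional + tail))
    return variants


print(tokenize("preds(j)ednik"))  # the record falls apart into three tokens
print(expand("preds(j)ednik"))    # both pronunciation variants
print(expand("r(ij)eč"))
```

<p>A separator-based tokenizer splits the record into fragments, none of which is a word, so the variants have to be expanded before any dictionary lookup.</p>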
<p>A factor that remains neglected in linguistic discussions is the multilingual situation in Serbia. There are several minority groups in Serbia, most of them located in the north, in the region of Vojvodina. Approximately 18 minority languages are in use, of which Hungarian is the most important, and some of them support rather lively publishing activity in Serbia. Automatic linguistic support for information exchange between these different language systems is practically non‐existent. In addition, the influence of other languages, such as English and French, combined with the state of lexicographic and linguistic theory and, in particular, with unstable terminology, generates additional difficulties in the transfer of information and knowledge.</p>
</sec>
<sec>
<title>Electronic dictionary</title>
<p>One possible solution to these problems is found in the theoretical framework of the lexicon‐grammar (Gross, 1975), which gives one answer to the problem of how information is distributed between grammar and dictionary[2]. One consequence of this approach is the precise and systematic encoding of grammatical information for every lexical entry (Gross, 1989). The specific form of electronic dictionary developed within this theoretical model, which is intended for text processing only and not for human use, extensively describes the features of lexical units. It is therefore possible to assign to a sequence of characters from the text the lexical unit with the corresponding grammatical information. The basic unit of this model is the simple word, defined as a sequence of characters between two separators. The components of the system are:</p>
<list list-type="bullet">
<list-item>
<label></label>
<p>a dictionary of simple lexical units (DELAS) with an accompanying dictionary of the corresponding inflected forms (DELAF);</p>
</list-item>
<list-item>
<label></label>
<p>a dictionary of compounds (DELAC) with an accompanying dictionary of corresponding inflected forms (DELACF);</p>
</list-item>
<list-item>
<label></label>
<p>a system of local grammars that describe wider text fragments in the form of finite‐state transducers.</p>
</list-item>
</list>
<p>The dictionaries of compounds and the local grammars are used as a device for eliminating ambiguity (Roche, 1997). These components, or parts of them, have been developed for several European languages, namely French, English, German, Italian, Spanish, Portuguese, Polish, Greek, Bulgarian, and Serbo‐Croatian.</p>
<p>For instance, the DELAF dictionary of simple words for English has the following form:</p>
<list list-type="bullet">
<list-item>
<label></label>
<p>abbreviating,abbreviate.V4:ing</p>
</list-item>
<list-item>
<label></label>
<p>abbreviated,abbreviate.V4:Pp:Pret</p>
</list-item>
<list-item>
<label></label>
<p>abbreviates,abbreviate.V4:Pr3s</p>
</list-item>
<list-item>
<label></label>
<p>abbreviation,.N1:Ns</p>
</list-item>
<list-item>
<label></label>
<p>abbreviations,abbreviation.N1:Np</p>
</list-item>
<list-item>
<label></label>
<p>abbreviator,.N1:Ns</p>
</list-item>
<list-item>
<label></label>
<p>abbreviators,abbreviator.N1:Np</p>
</list-item>
</list>
<p>where, for instance, abbreviating represents the textual word, abbreviate represents the lexical word and V4:ing represents the corresponding grammatical code.</p>
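<p>As a rough illustration of how such entries can be consumed, the following sketch parses the lines shown above. The names Entry, parse_delaf and build_index are hypothetical, and reading an empty lexical-word field as "same as the textual word" is an assumption inferred from the entries for abbreviation and abbreviator.</p>

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    form: str        # textual word
    lemma: str       # lexical word
    code: str        # grammatical class, e.g. "V4"
    features: tuple  # remaining grammatical values, e.g. ("Pp", "Pret")


def parse_delaf(line):
    """Split 'form,lemma.CODE:feat:feat' into its parts."""
    form, rest = line.strip().split(",", 1)
    lemma, grammar = rest.split(".", 1)
    code, *features = grammar.split(":")
    # Assumption: an empty lemma field means the lemma equals the form.
    return Entry(form, lemma or form, code, tuple(features))


def build_index(lines):
    """Index entries by textual word; one form may have several readings."""
    index = defaultdict(list)
    for line in lines:
        entry = parse_delaf(line)
        index[entry.form].append(entry)
    return index


LINES = [
    "abbreviating,abbreviate.V4:ing",
    "abbreviated,abbreviate.V4:Pp:Pret",
    "abbreviates,abbreviate.V4:Pr3s",
    "abbreviation,.N1:Ns",
    "abbreviations,abbreviation.N1:Np",
]

index = build_index(LINES)
print(index["abbreviations"][0].lemma)
print(index["abbreviated"][0].features)
```

<p>Looking a textual word up in the index yields its lexical word, and a simple membership test on the same index tells whether a form is known at all.</p>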
<p>Compounds are defined (Silberztein, 1993) as sequences that include several simple words. However, compounds, which are sometimes called frozen expressions, have to be distinguished from free sequences of simple words: what makes them different is that the syntactic properties of a compound usually cannot be deduced from the syntactic properties of its constituent simple words. Examples of such expressions in English, belonging to different parts of speech, are: morning glory, make‐believe, a piece of cake. More often than not, compounds are written without a characteristic separation sign. At the level of compounds, the restrictions in effect in syntagmatic constructions are also precisely and extensively described.</p>
<p>The system of electronic dictionaries, together with the local grammars integrated in the INTEX system, enables text transformations that are based on natural language organization, such as automatic lemmatization, which associates with every textual word the corresponding lexical word (for instance, the lexical word abbreviation would be associated with the textual word abbreviations). This system is a resource for different applications rather than an application itself. For instance, a spelling checker can be obtained as an excerpt from the electronic dictionary.</p>
<p>Taking this methodological base and format as a starting point, a prototype system of morphological electronic dictionaries has been developed for simple words and compounds in Serbo‐Croatian (Vitas, 1993; Krstev, 1997; Nenadić, 1997). Short excerpts from the constructed dictionaries are given in
<xref ref-type="fig" rid="F_0721000404001">Table I</xref>
[3].</p>
<p>Taking into account the state of affairs in traditional lexicography and the problems outlined earlier, the construction of the system of electronic dictionaries of Serbo‐Croatian has to satisfy the following requirements:</p>
<list list-type="bullet">
<list-item>
<label></label>
<p>It must be independent of the alphabet, that is, for instance the word
<italic>saopštenje</italic>
(Eng. communication) has to have one entry in the electronic dictionary which is independent of its coding in text.</p>
</list-item>
<list-item>
<label></label>
<p>The dictionaries have to synthesize different pronunciations and dialect variations by reducing them to some canonical form. This means that, for instance, the words
<italic>reč</italic>
(
<italic>ek.</italic>
) and
<italic>riječ</italic>
(
<italic>jek.</italic>
) (Eng. word) have to be connected appropriately.</p>
</list-item>
<list-item>
<label></label>
<p>The dictionaries have to neutralize the orthographic variations. For instance, due to the phonologically‐based orthography, variations
<italic>hleb</italic>
and
<italic>leb</italic>
as well as
<italic>hljeb</italic>
and
<italic>ljeb</italic>
(Eng. bread) are possible. Also,
<italic>dan‐i‐noć</italic>
,
<italic>dan i noć</italic>
and
<italic>daninoć</italic>
(Eng. pansy) are all orthographically correct and, in this latter case, should be covered by a dictionary of compounds.</p>
</list-item>
</list>
<p>Research has shown that variations due to different pronunciations or of orthographic origin do not influence the morphological behaviour of lexical units. For instance, in the examples mentioned,
<italic>reč</italic>
, ek./
<italic>riječ</italic>
, jek. (Eng. word) and
<italic>hleb</italic>
, ek./
<italic>leb</italic>
, ek./
<italic>hljeb</italic>
, jek./
<italic>ljeb</italic>
, jek. (Eng. bread) all the variations of the same lexical unit have, on the morphological level, the same inflective and derivational features. These variations influence only the root morpheme.</p>
<p>An extensive description of all these variations in the electronic dictionary would unnecessarily multiply its size. Besides that, the dialect variants would be consistently reproduced at the level of the lexical unit by representing the same lexical unit with several different lexical entries.</p>
<p>One solution may be found in the concept of the lexicographeme as a means of dictionary normalization (Krstev, 1997). The concept will be illustrated with one example. For the lexical unit
<italic>mesec</italic>
(Eng. moon) several variant forms exist according to different pronunciations and dialects:
<italic>mesec</italic>
, ek./
<italic>mjesec</italic>
, jek./
<italic>misec</italic>
, ik./
<italic>mljesec</italic>
, dial.jek. All these forms are recorded in the dictionary (SANU, 1959). The normalized form of this lexical unit could be
<italic>m#esec</italic>
, where
<italic>#e</italic>
represents the lexicographeme having the following properties: in written text it can be realized as one of the following sequences: –
<italic>e, je, i </italic>
– depending on the pronunciation. Furthermore, it can affect the preceding grapheme – that is, palatalize the preceding consonant – in the case of certain dialects.</p>
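<p>The behaviour of the lexicographeme can be sketched as follows. This is only an illustration under stated assumptions: the variety labels, the tables PLAIN and PALATALIZED and the normalized form d#eca are hypothetical, chosen so that the forms quoted in this article come out; they are not the encoding of the actual dictionaries.</p>

```python
import re

# Plain realizations of the lexicographeme "#e" per pronunciation.
PLAIN = {"ek": "e", "jek": "je", "ik": "i"}
# Illustrative only: consonants that a dialectal jekavian variety
# palatalizes before "#e", absorbing the "j".
PALATALIZED = {"m": "mlj", "d": "đ"}


def realize(normalized, variety):
    """Turn a normalized entry such as 'm#esec' into a written form."""
    if variety in PLAIN:
        return normalized.replace("#e", PLAIN[variety])
    if variety == "dial.jek":
        def palatalize(match):
            consonant = match.group(1)
            if consonant in PALATALIZED:
                return PALATALIZED[consonant] + "e"
            return consonant + "je"
        written = re.sub(r"(\w)#e", palatalize, normalized)
        # A word-initial "#e" has no preceding consonant to affect.
        return written.replace("#e", "je")
    raise ValueError(f"unknown variety: {variety}")


for variety in ("ek", "jek", "ik", "dial.jek"):
    print(variety, realize("m#esec", variety))
```

<p>One normalized entry thus stands for all variant forms, and the same table can be read in reverse to map a variant found in a text back to its entry.</p>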
<p>This concept leads to the development of a system of meta‐dictionaries from which a particular system of dictionaries, corresponding to a certain dialect or orthographic practice, can be derived. At the level of text processing, this system enables different forms of text tuning, such as transformation from one alphabet to another or conversion from one pronunciation to another.</p>
<p>For the reasons given, bilingual lexicography is often unable to define precise translation equivalents (Krstev and Vitas, 1998). Bilingual lexicography, as well as comparative language studies, is thus burdened with the same problems that complicate the processing of Serbo‐Croatian, and these can give rise to various defects in multilingual communication. The normalization of electronic dictionaries can therefore contribute, to some extent, to the improvement of bilingual lexicography.</p>
<p>These concepts will be illustrated with a few examples that show how electronic publishing techniques improve when a text is underpinned by the electronic dictionary.</p>
</sec>
<sec>
<title>Electronic edition of Vuk’s Serbian proverbs</title>
<p>This collection, comprising about 7,000 proverbs, was assembled by the language reformer Vuk Stefanović Karadžić. Its first edition dates from 1849. All later editions reproduce this first edition in every respect. The inventory of proverbs has not changed either, although it contains references to nonexistent proverbs and lists many identical proverbs that differ only in word order. Nevertheless, this text is an essential part of any corpus of contemporary Serbian and Croatian.</p>
<p>In 1987 the distinguished Belgrade publisher NOLIT started a project to re‐edit this collection of proverbs, which ended in 1996 with the publication of a new paper edition. Besides removing some deficiencies of the old edition, the new edition is distinguished by a comprehensive index that contains all the lexical words – nouns, verbs, adjectives and numerals – that occur in the proverbs. The presence of numerous variations has, however, encumbered the index significantly: formatted in small type, it takes up a third of the whole book.</p>
<p>At the same time, the preparation of an electronic edition started at the Faculty of Mathematics, purely as a scientific, non‐commercial project. The electronic edition is based on the encoding scheme proposed by the Text Encoding Initiative (TEI) (Sperberg‐McQueen and Burnard, 1995). In addition, the text of the proverbs has been underpinned by electronic dictionaries. For instance, the proverbs
<italic>Ja kad viđeh zelen drijen predadoh mu vas moj drijem i lijen</italic>
(Eng. When I saw the green dogwood I gave him over my slumber and my laziness) and
<italic>Ja mu kažem adumac sam a on pita koliko đece imam</italic>
(Eng. I say to him I am a eunuch and he asks how many children I have) are represented in the following way in the electronic edition:</p>
<list list-type="order">
<list-item>
<label> </label>
<p>
<list list-type="simple">
<list-item>
<p>Ja</w></p>
</list-item>
<list-item>
<p>kad</w></p>
</list-item>
<list-item>
<p>vi&dx;eh</w></p>
</list-item>
<list-item>
<p>zelen</w></p>
</list-item>
<list-item>
<p>drijen</w></p>
</list-item>
<list-item>
<p>predadoh</w></p>
</list-item>
<list-item>
<p>mu</w></p>
</list-item>
<list-item>
<p>vas</w></p>
</list-item>
<list-item>
<p>moj</w></p>
</list-item>
<list-item>
<p>drijem</w></p>
</list-item>
<list-item>
<p>i</w></p>
</list-item>
<list-item>
<p>lijen</w></p>
</list-item>
</list>
<list list-type="order">
<list-item>
<label> </label>
<p></pv></p>
</list-item>
<list-item>
<label>1.2. </label>
<p> </p>
</list-item>
<list-item>
<label>1.3. </label>
<p> </pv> Ja</w></p>
</list-item>
<list-item>
<label>1.4. </label>
<p> mu</w></p>
</list-item>
<list-item>
<label>1.5. </label>
<p> </opt.ph> ka&zx;em</w></p>
</list-item>
</list>
<list list-type="simple">
<list-item>
<p>adumac</w></p>
</list-item>
<list-item>
<p>sam</w></p>
</list-item>
<list-item>
<p>a</w></p>
</list-item>
<list-item>
<p>on</w></p>
</list-item>
<list-item>
<p>pita</w></p>
</list-item>
<list-item>
<p>koliko</w></p>
</list-item>
<list-item>
<p>&dx;ece</w></p>
</list-item>
<list-item>
<p>imam</w></p>
</list-item>
</list>
<list list-type="order">
<list-item>
<label> </label>
<p></pv></p>
</list-item>
<list-item>
<label>1.2. </label>
<p> </divp></p>
</list-item>
</list>
</p>
<p>In this electronic form, every textual word in a proverb has been tagged with an SGML tag whose attribute describes its possible lexical words, the particularities of its pronunciation and possible grammatical information. For instance, two lexical words can be associated with the textual word
<italic>pita</italic>
:
<italic>pita</italic>
(Eng. pie) and
<italic>pitati</italic>
(Eng. to ask). The textual word
<italic>đece</italic>
is associated with the lexical word
<italic>đeca</italic>
from the eastern
<italic>jekavian</italic>
pronunciation, whose
<italic>ekavian</italic>
variant is
<italic>deca</italic>
(Eng. children). This form of electronic text enables many text transformations, such as automatic lemmatization or indexing. A text can also be transformed so that it uses one chosen pronunciation or orthographic norm, as is shown by the transformation of the same two proverbs into the
<italic>ekavian</italic>
pronunciation, which was done automatically using the information in the
<italic>a</italic>
‐attribute and the appropriate information from the underlying dictionaries:</p>
<p>
<list list-type="order">
<list-item>
<label> </label>
<p><!– East jekavian pronunciation (dialect) –> 1770 Ja kad vi&dx;eh zelen drijen, predadoh mu vas moj drijem i lijen.</p>
</list-item>
<list-item>
<label>1.2. </label>
<p> <!– Ekavian pronunciation –> 1770 Ja kad videh zelen dren, predadoh mu vas moj drem i len.</p>
</list-item>
<list-item>
<label>1.3. </label>
<p> <!– East jekavian pronunciation (dialect) –> 1781 Ja mu ka&zx;em adumac sam, a on pita koliko &dx;ece imam.</p>
</list-item>
<list-item>
<label>1.4. </label>
<p> <!– Ekavian pronunciation; contemporary orthography concerning the use of
<italic>h</italic>
–> 1781 Ja mu ka&zx;em hadumac sam, a on pita koliko dece imam.</p>
</list-item>
</list>
</p>
</list-item>
</list>
</sec>
<sec>
<title>Other applications</title>
<p>In addition to the hypertextual connections deduced from the graphical and logical layout of a text, it seems natural to enable navigation through the text by lexical units. While such navigation can be approximated in English with a kind of find (or search) command, in Slavic languages, which are characterized by rich morphological and derivational systems, a special kind of support has to be developed. To investigate this possibility, an experiment was carried out on a sample of mathematical textbooks (Nenadić, 1997). In contrast with the previous example, the texts were parsed with dictionaries of compounds and appropriate local grammars. The investigation was limited to noun phrases. For instance, the phrase
<italic>niz intervala</italic>
(Eng. sequence of intervals) at the level of simple words can be recognized as a sequence:</p>
<list list-type="order">
<list-item>
<label>(1) </label>
<p>Pre Ns(g);</p>
</list-item>
<list-item>
<label>(2) </label>
<p>Pre Np(g);</p>
</list-item>
<list-item>
<label>(3) </label>
<p>Ns(na) Ns(g); and</p>
</list-item>
<list-item>
<label>(4) </label>
<p>Ns(na) Np(g).</p>
</list-item>
</list>
<p>In these sequences Pre denotes a preposition, Ns a noun in the singular and Np a noun in the plural. After the text is underpinned with the electronic dictionary and agreement constraints are applied, the first two possibilities are rejected, as the preposition
<italic>niz</italic>
(Eng. down) requires a noun in the accusative.</p>
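<p>The filtering step can be sketched as follows. The data layout is hypothetical, and the reading of the letters in parentheses as case marks (n nominative, g genitive, a accusative) is an assumption; only the four candidate sequences and the accusative requirement of the preposition come from the text above.</p>

```python
from itertools import product
from typing import NamedTuple


class Reading(NamedTuple):
    pos: str           # "Pre" or "N"
    number: str = ""   # "s" or "p" for nouns
    cases: str = ""    # case letters the form can express, e.g. "na"
    governs: str = ""  # case a preposition requires of its noun

    def label(self):
        if self.pos == "Pre":
            return "Pre"
        return f"N{self.number}({self.cases})"


# Dictionary lookup result for each textual word (illustrative data).
LEXICON = {
    "niz": [Reading("Pre", governs="a"), Reading("N", "s", "na")],
    "intervala": [Reading("N", "s", "g"), Reading("N", "p", "g")],
}


def agrees(left, right):
    """A preposition must be followed by a noun in the case it governs."""
    if left.pos == "Pre":
        return right.pos == "N" and left.governs in right.cases
    return True


def analyses(phrase):
    """Enumerate reading sequences that satisfy the agreement constraint."""
    words = phrase.split()
    result = []
    for combo in product(*(LEXICON[word] for word in words)):
        if all(agrees(a, b) for a, b in zip(combo, combo[1:])):
            result.append(" ".join(reading.label() for reading in combo))
    return result


print(analyses("niz intervala"))
```

<p>Of the four combinations, only the two in which niz is read as a noun survive, because intervala offers no accusative reading for the preposition to govern.</p>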
<p>This first attempt showed that such text indexing is fruitful, especially with respect to eliminating ambiguity. The goal of further research will be to establish an appropriate semantic network.</p>
<p>In parallel with the construction of the electronic dictionary, the production of a corpus of parallel texts is under way. It is being done in cooperation with the TELRI (Trans‐European Language Resources Infrastructure) project, which is funded by the European Commission. TELRI has launched work on the production of a corpus of parallel texts for European languages. As a result, electronic versions of Plato’s
<italic>Republic</italic>
in more than 20 languages, including Serbo‐Croatian, have been produced, fully SGML encoded according to the TEI guidelines. Alignments with English, French and German have been produced. In cooperation with the MULTEXT‐East project, an electronic version of Orwell’s
<italic>1984</italic>
in ten languages, including Serbo‐Croatian, has been produced, fully SGML encoded according to the CES1 guidelines[4]. Alignment has been done for all the language pairs.</p>
<p>As an example, the passage from Orwell’s
<italic>1984</italic>
: “‘I was passing,’ said Winston vaguely. ‘I just looked in. I don’t want anything in particular’”, which is translated into Serbo‐Croatian as “‘
<italic>Samo sam prolazio’, neodređeno reče Vinston, ‘pa sam pogledao. Nisam tražio ništa posebno</italic>
’”, is presented in the following way in the parallel corpus, which has been aligned at the sentence level[5].</p>
<list list-type="order">
<list-item>
<label> </label>
<p>*** Link: 1‐2 ***</p>
</list-item>
<list-item>
<label>2. </label>
<p> “Samo sam prolazio”, neodre&dx;eno re&cy;e Vinston, “pa sam pogledao.” .EOS</p>
</list-item>
<list-item>
<label>3. </label>
<p> “I was passing,” said Winston vaguely. “I just looked in.” .EOS</p>
</list-item>
<list-item>
<label>4. </label>
<p> *** Link: 1‐1 ***</p>
</list-item>
<list-item>
<label>5. </label>
<p> “Nisam tra&zx;io ni&sx;ta naro&cy;ito.” .EOS</p>
</list-item>
<list-item>
<label>6. </label>
<p> “I don’t want anything in particular.” .EOS</p>
</list-item>
<list-item>
<label>7. </label>
<p> .EOP</p>
</list-item>
</list>
<p>On the basis of the experience gained during the work on these two projects, the production of a parallel corpus has started in which one language will be Serbo‐Croatian, with the intention of aligning it with as many languages as possible, especially with languages in direct contact with Serbo‐Croatian. It is expected that underpinning such a parallel corpus with electronic dictionaries will enable, to a certain extent, the automatic establishment of translation equivalents.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>This article has illustrated the problems encountered in an unstable linguistic system. In traditional publishing these problems were disguised by the human understanding of the language in which a text is typeset. Promoting electronic publishing requires the explicit representation of at least a part of this knowledge through support for the processing of linguistic data.</p>
</sec>
<sec>
<title>Notes</title>
<list list-type="order">
<list-item>
<label>1. </label>
<p>For instance, the daily newspaper Danas is published in the Latin alphabet and uses degraded Latin for its WWW presentation (
<ext-link ext-link-type="uri" xlink:href="http://www.danas.co.yu/">http://www.danas.co.yu/</ext-link>
). The daily newspaper Večernje Novosti is published in Cyrillic but also uses degraded Latin for its WWW presentation (
<ext-link ext-link-type="uri" xlink:href="http://www.vnovosti.co.yu/">http://www.vnovosti.co.yu/</ext-link>
). On the other hand, the weekly newspapers Vreme (paper version in Latin) and Ilustrovana Politika (paper version in Cyrillic) both use the Latin alphabet coded in the Windows 1250 code page for their WWW presentations (
<ext-link ext-link-type="uri" xlink:href="http://www.danas.co.yu">http://www.danas.co.yu</ext-link>
and
<ext-link ext-link-type="uri" xlink:href="http://www.politika.co.yu/ilustro/">http://www.politika.co.yu/ilustro/</ext-link>
, respectively). The radio station B92 presents some of its news in the Latin alphabet coded in ISO 8859‐2 and some in the degraded Latin alphabet (
<ext-link ext-link-type="uri" xlink:href="http://www.opennet.org/">http://www.opennet.org/</ext-link>
).</p>
</list-item>
<list-item>
<label>2. </label>
<p>An overview of this approach can be found at
<ext-link ext-link-type="uri" xlink:href="http://www.ladl.jussieu.fr/">http://www.ladl.jussieu.fr/</ext-link>
.</p>
</list-item>
<list-item>
<label>3. </label>
<p>In all Serbo‐Croatian examples, diacritics and digraphs are represented by the following SGML entities: &cy; (č), &cx; (ć), &sx; (š), &zx; (ž), &dx; (đ), &lx; (lj), &nx; (nj), and &dy; (dž). This does not necessarily mean that the same representation was used in a real application.</p>
</list-item>
<list-item>
<label>4. </label>
<p>4 This project is described at
<ext-link ext-link-type="uri" xlink:href="http://www.cs.vassar.edu/CES/CES1.html">http://www.cs.vassar.edu/CES/CES1.html</ext-link>
</p>
</list-item>
<list-item>
<label>5. </label>
<p>5 Both the corpora and the supporting software have been produced on a CD‐ROM, a description of which can be found at
<ext-link ext-link-type="uri" xlink:href="http://www.telri.de/">http://www.telri.de/</ext-link>
.</p>
</list-item>
</list>
</sec>
<sec>
<fig position="float" id="F_0721000404001">
<label>
<bold>Table I
<x> </x>
</bold>
</label>
<caption>
<p>Short excerpts from the constructed dictionaries</p>
</caption>
<graphic xlink:href="0721000404001.tif"></graphic>
</fig>
</sec>
</body>
<back>
<ref-list>
<title>References and further reading</title>
<ref id="B1">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Birnbaum</surname>
,
<given-names>D.J</given-names>
</string-name>
</person-group>
. (
<year>1995</year>
), “
<article-title>
<italic>Informational and presentational units in early Cyrillic writing</italic>
</article-title>
”,
<source>
<italic>Proceedings of the First International Conference on Computer Processing of Medieval Slavic Manuscripts</italic>
</source>
,
<publisher-loc>Blagoevgrad</publisher-loc>
, 24‐28 July, pp.
<fpage>41</fpage>
<x></x>
<lpage>9</lpage>
.</mixed-citation>
</ref>
<ref id="B2">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Gross</surname>
,
<given-names>M</given-names>
</string-name>
</person-group>
. (
<year>1975</year>
),
<source>
<italic>Méthodes en Syntaxe</italic>
</source>
,
<publisher-name>Hermann</publisher-name>
,
<publisher-loc>Paris</publisher-loc>
.</mixed-citation>
</ref>
<ref id="B3">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Gross</surname>
,
<given-names>M</given-names>
</string-name>
</person-group>
. (
<year>1989</year>
), “
<article-title>
<italic>La construction de dictionnaires électroniques</italic>
</article-title>
”,
<source>
<italic>Annales des Télécommunications</italic>
</source>
, Vol.
<volume>44</volume>
Nos
<issue>1‐2</issue>
, pp.
<fpage>4</fpage>
<x></x>
<lpage>19</lpage>
.</mixed-citation>
</ref>
<ref id="B4">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Gross</surname>
,
<given-names>M</given-names>
</string-name>
</person-group>
. (
<year>1989a</year>
), “
<article-title>
<italic>The use of finite automata in the lexical representation of natural languages</italic>
</article-title>
”, in
<person-group person-group-type="editor">
<string-name>
<surname>Gross</surname>
,
<given-names>M</given-names>
</string-name>
</person-group>
. and
<person-group person-group-type="editor">
<string-name>
<surname>Perrin</surname>
,
<given-names>D</given-names>
</string-name>
</person-group>
. (Eds),
<source>
<italic>Electronic Dictionaries and Automata in Computational Linguistics</italic>
</source>
,
<article-title>
<italic>Lecture Notes in Computer Science</italic>
</article-title>
, No.
<issue>377</issue>
,
<publisher-name>Springer‐Verlag</publisher-name>
,
<publisher-loc>Berlin</publisher-loc>
, pp.
<fpage>34</fpage>
<x></x>
<lpage>50</lpage>
.</mixed-citation>
</ref>
<ref id="B5">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Krstev</surname>
,
<given-names>C</given-names>
</string-name>
</person-group>
. (
<year>1997</year>
), “
<article-title>
<italic>One approach to text modeling and transformation</italic>
</article-title>
”, (in Serbo‐Croatian), PhD thesis,
<publisher-loc>Faculty of Mathematics, University of Belgrade</publisher-loc>
.</mixed-citation>
</ref>
<ref id="B6">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Krstev</surname>
,
<given-names>C</given-names>
</string-name>
</person-group>
. and
<person-group person-group-type="author">
<string-name>
<surname>Vitas</surname>
,
<given-names>D.</given-names>
</string-name>
</person-group>
(
<year>1998</year>
), “
<article-title>
<italic>Morphological normalization of translation equivalents</italic>
</article-title>
”,
<source>
<italic>Third European TELRI Seminar: “Translation Equivalence – Theory and Practice”</italic>
</source>
, Montecatini Terme, 16‐18 October 1997, Institut für deutsche Sprache, Mannheim and Tuscany Word Center, Montecatini Terme, pp.
<fpage>117</fpage>
<x></x>
<lpage>23</lpage>
.</mixed-citation>
</ref>
<ref id="B7">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Krstev</surname>
,
<given-names>C</given-names>
</string-name>
</person-group>
.,
<person-group person-group-type="author">
<string-name>
<surname>Pavlović‐Lažetić</surname>
,
<given-names>G.</given-names>
</string-name>
</person-group>
and
<person-group person-group-type="author">
<string-name>
<surname>Vitas</surname>
,
<given-names>D</given-names>
</string-name>
</person-group>
. (
<year>1997</year>
), “
<article-title>
<italic>Neutralization of variations in a dictionary entry’s structure in Serbo‐Croatian</italic>
</article-title>
”, in
<person-group person-group-type="editor">
<string-name>
<surname>Junghanns</surname>
,
<given-names>U.</given-names>
</string-name>
</person-group>
and
<person-group person-group-type="editor">
<string-name>
<surname>Zybatow</surname>
,
<given-names>G.</given-names>
</string-name>
</person-group>
(Eds),
<source>
<italic>Formale Slavistik</italic>
</source>
,
<publisher-name>Vervuert Verlag</publisher-name>
,
<publisher-loc>Frankfurt am Main</publisher-loc>
, pp.
<fpage>417</fpage>
<x></x>
<lpage>25</lpage>
.</mixed-citation>
</ref>
<ref id="B8">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Nenadić</surname>
,
<given-names>G</given-names>
</string-name>
</person-group>
. (
<year>1997</year>
), “
<article-title>
<italic>Algorithms for recognition of compounds in mathematical text and its applications</italic>
</article-title>
”, (in Serbo‐Croatian), Masters Thesis,
<publisher-loc>Faculty of Mathematics, University of Belgrade.</publisher-loc>
</mixed-citation>
</ref>
<ref id="B9">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Popović</surname>
,
<given-names>L.J</given-names>
</string-name>
</person-group>
. (
<year>1996</year>
), “Deux approches idéologiques de la vernacularisation de la langue littéraire chez les Serbes à la fin du 18e et dans la première moitié du 19e siècle”,
<source>
<italic>Langues et Nation en Europe Centrale et Orientale du 19e Siècle à Nos Jours</italic>
</source>
,
<publisher-name>Cahiers de l’ILSL</publisher-name>
,
<publisher-loc>Lausanne</publisher-loc>
, No.
<issue>8</issue>
, pp.
<fpage>209</fpage>
<x></x>
<lpage>40</lpage>
.</mixed-citation>
</ref>
<ref id="B10">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Roche</surname>
,
<given-names>E</given-names>
</string-name>
</person-group>
. and
<person-group person-group-type="author">
<string-name>
<surname>Schabes</surname>
,
<given-names>Y</given-names>
</string-name>
</person-group>
. (Eds) (
<year>1997</year>
),
<source>
<italic>Finite‐state Language Processing, A Bradford Book</italic>
</source>
,
<publisher-name>The MIT Press</publisher-name>
,
<publisher-loc>Cambridge, MA and London</publisher-loc>
.</mixed-citation>
</ref>
<ref id="B11">
<mixed-citation>
<person-group person-group-type="author">
<string-name>SANU </string-name>
</person-group>
(
<year>1959‐1990</year>
),
<source>
<italic>Rečnik Srpskohrvatskog Književnog i Narodnog Jezika, Vol. 1‐14 (A‐N)</italic>
</source>
,
<publisher-loc>Srpska akademija nauka i umetnosti i Institut za srpskohrvatski jezik, Beograd.</publisher-loc>
</mixed-citation>
</ref>
<ref id="B12">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Schwartze</surname>
,
<given-names>C</given-names>
</string-name>
</person-group>
. (
<year>1985</year>
), “
<article-title>
<italic>Text understanding and lexical knowledge</italic>
</article-title>
”,
<publisher-loc>lecture at the International Pragmatics Conference, Viareggio.</publisher-loc>
</mixed-citation>
</ref>
<ref id="B13">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Selimović</surname>
,
<given-names>M.</given-names>
</string-name>
</person-group>
(
<year>1987</year>
),
<source>
<italic>Za i protiv Vuka</italic>
</source>
,
<publisher-name>BIGZ</publisher-name>
,
<publisher-loc>Beograd</publisher-loc>
.</mixed-citation>
</ref>
<ref id="B14">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Silberztein</surname>
,
<given-names>M.</given-names>
</string-name>
</person-group>
(
<year>1993</year>
),
<source>
<italic>Dictionnaires Electroniques et Analyse Automatique de Textes: Le Système INTEX</italic>
</source>
,
<publisher-name>Masson</publisher-name>
,
<publisher-loc>Paris.</publisher-loc>
</mixed-citation>
</ref>
<ref id="B15">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Sperberg‐McQueen</surname>
,
<given-names>C.M</given-names>
</string-name>
</person-group>
. and
<person-group person-group-type="author">
<string-name>
<surname>Burnard</surname>
,
<given-names>L</given-names>
</string-name>
</person-group>
. (Eds) (
<year>1995</year>
),
<source>
<italic>Guidelines for Electronic Text Encoding and Interchange</italic>
</source>
,
<publisher-loc>Text Encoding Initiative, Chicago‐Oxford</publisher-loc>
.</mixed-citation>
</ref>
<ref id="B16">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Vitas</surname>
,
<given-names>D.</given-names>
</string-name>
</person-group>
(
<year>1993</year>
), “
<article-title>
<italic>Mathematical model of Serbo‐Croatian morphology (nominal inflection)</italic>
</article-title>
”, (in Serbo‐Croatian), PhD thesis,
<publisher-loc>Faculty of Mathematics, University of Belgrade.</publisher-loc>
</mixed-citation>
</ref>
<ref id="B17">
<mixed-citation>
<person-group person-group-type="author">
<string-name>
<surname>Vitas</surname>
,
<given-names>D</given-names>
</string-name>
</person-group>
. and
<person-group person-group-type="author">
<string-name>
<surname>Krstev</surname>
,
<given-names>C</given-names>
</string-name>
</person-group>
. (
<year>1996</year>
), “
<article-title>
<italic>Tuning the text with electronic dictionary</italic>
</article-title>
”,
<source>
<italic>Papers in Computational Lexicography</italic>
</source>
,
<publisher-loc>COMPLEX’96, Budapest</publisher-loc>
, pp.
<fpage>267</fpage>
<x></x>
<lpage>76</lpage>
.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo lang="en">
<title>Cultural impacts on electronic publishing experience in Serbia</title>
</titleInfo>
<titleInfo type="alternative" lang="en" contentType="CDATA">
<title>Cultural impacts on electronic publishing experience in Serbia</title>
</titleInfo>
<name type="personal">
<namePart type="given">Duško</namePart>
<namePart type="family">Vitas</namePart>
<affiliation>Assistant Professor at the Computer Science Department of the Faculty of Mathematics, University of Belgrade, Yugoslavia</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Cvetana</namePart>
<namePart type="family">Krstev</namePart>
<affiliation>Assistant Professor at the Library and Information Science Department of the Philological Faculty, University of Belgrade, Yugoslavia</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="research-article"></genre>
<originInfo>
<publisher>MCB UP Ltd</publisher>
<dateIssued encoding="w3cdtf">1999-07-01</dateIssued>
<copyrightDate encoding="w3cdtf">1999</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
</language>
<physicalDescription>
<internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">Discusses, from the computational point of view, the linguistic influences on an electronic publishing infrastructure in an environment with unstable linguistic standardization. Essentially, in Serbia, for at least the last half of the century, publishing has been based on the following facts: two alphabetic systems are regularly in use, with the possibility of mixing both alphabets in the same document; the various dialects are accepted as part of the linguistic norm; orthography is unstable (presently, several linguistic attitudes that have different views of the orthographic norm are under discussion); and many minority languages are in use, which makes it difficult to provide efficient contact between different communities through electronic publishing. In this context, traditional Serbian linguistics and lexicography have not developed a systematic solution that responds to this complex situation in a way that enables the adequate incorporation of the new publishing technologies. Owing to these constraints, the direct application of electronic publishing tools frequently degrades the linguistic message. In such an environment, the promotion of electronic publishing therefore needs specific solutions. The paper discusses a general frame, based on a specifically encoded system of electronic dictionaries, that makes electronic texts independent of some of the mentioned constraints. The objective of such a frame is to enable the linguistic normalization of texts at the level of their internal representation, and to establish bridges for communicating with other language societies. Some aspects of an electronic text representation that ensures correct interpretation in different graphical systems and in different dialects are described. This also allows text indexing and retrieval using the same techniques that are available for languages not burdened with these problems.</abstract>
<subject>
<genre>keywords</genre>
<topic>Electronic publishing</topic>
<topic>Language</topic>
<topic>National cultures</topic>
<topic>Serbia</topic>
</subject>
<relatedItem type="host">
<titleInfo>
<title>New Library World</title>
</titleInfo>
<genre type="journal">journal</genre>
<subject>
<genre>Emerald Subject Group</genre>
<topic authority="SubjectCodesPrimary" authorityURI="cat-LISC">Library & information science</topic>
<topic authority="SubjectCodesSecondary" authorityURI="cat-LLM">Librarianship/library management</topic>
<topic authority="SubjectCodesSecondary" authorityURI="cat-LISE">Library & information services</topic>
</subject>
<identifier type="ISSN">0307-4803</identifier>
<identifier type="PublisherID">nlw</identifier>
<identifier type="DOI">10.1108/nlw</identifier>
<part>
<date>1999</date>
<detail type="volume">
<caption>vol.</caption>
<number>100</number>
</detail>
<detail type="issue">
<caption>no.</caption>
<number>4</number>
</detail>
<extent unit="pages">
<start>171</start>
<end>179</end>
</extent>
</part>
</relatedItem>
<identifier type="istex">06F30003BD4B2A843A15B75F4C9D4525B49229BE</identifier>
<identifier type="DOI">10.1108/03074809910273278</identifier>
<identifier type="filenameID">0721000404</identifier>
<identifier type="original-pdf">0721000404.pdf</identifier>
<identifier type="href">03074809910273278.pdf</identifier>
<accessCondition type="use and reproduction" contentType="copyright">© MCB UP Limited</accessCondition>
<recordInfo>
<recordContentSource>EMERALD</recordContentSource>
</recordInfo>
</mods>
</metadata>
<serie></serie>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000226 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 000226 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Ticri
   |area=    TeiVM2
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:06F30003BD4B2A843A15B75F4C9D4525B49229BE
   |texte=   Cultural impacts on electronic publishing experience in Serbia
}}

This area was generated with Dilib version V0.6.31.
Data generation: Mon Oct 30 21:59:18 2017. Site generation: Sun Feb 11 23:16:06 2024