Description

INFORMATIQUE THÉORIQUE ET APPLICATIONS A. EHRENFEUCHT G. ROZENBERG Each regular code is included in a maximal regular code Informatique théorique et applications, tome 20, n o 1 (1986), p

Information

Category:
## Automobiles

Publish on:

Views: 44 | Pages: 9

Extension: PDF | Download: 0

Share

Transcript

INFORMATIQUE THÉORIQUE ET APPLICATIONS A. EHRENFEUCHT G. ROZENBERG Each regular code is included in a maximal regular code Informatique théorique et applications, tome 20, n o 1 (1986), p http://www.numdam.org/item?id=ita_ _1_89_0 AFCET, 1986, tous droits réservés. L accès aux archives de la revue «Informatique théorique et applications» implique l accord avec les conditions générales d utilisation (http://www.numdam. org/legal.php). Toute utilisation commerciale ou impression systématique est constitutive d une infraction pénale. Toute copie ou impression de ce fichier doit contenir la présente mention de copyright. Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques Informatique théorique et Applications/Théoretical Informaties and Applications (vol. 20, n 1, 1985, p. 89 à 96) rairma$407 EACH REGULAR CODE IS INCLUDED IN A MAXIMAL REGULAR CODE (*) by A. EHRENFEUCHT (*) and G. ROZENBERG ( 2 ) Abstract. It is proved that each regular code is included in a maximal regular code. A corollary ofthis resuit seules an open question from [R]. Résumé. On prouve que tout code rationnel est contenu dans un code rationnel maximal. Un corollaire de ce résultat répond à une question ouverte posée dans [R]. INTRODUCTION A language C E + is called a code if C* is a free submonoid of E* with base C. The theory of codes initiated by M. Schutzenberger [Sch] forms an interesting fragment of formai language theory. A code C^X + is called maximal if, for any xes* C, C[J{x} is not a code. Ail codes are subsets of maximal codes and the investigation of maximal codes forms an active research area within the theory of codes (see, e. g., [BPS], [PI], [R] and [SM]). In particular one is often interested in the problem of the following kind: given a code C of type X (e. g. finite or regular) is it possible to find a maximal code D of type X such that C D? It was shown in [R] that for finite codes this question gets a négative answer. Since then the following question remained open: is every finite code included in a maximal regular code? Obviously any finite (resp. regular) prefix code is included in a finite (resp. regular) maximal prefix code. Recently it was shown in [P2] that every finite biprefix code is included in a maximal biprefix regular code. In this paper we pro vide a positive answer to the above question. As a matter of fact we prove a more gênerai resuit (theorem 5): each regular code is included in a regular maximal code. We would like to emphasize the (*) Received May 1984, revised April (*) Department of Computer Science, University of Colorado, Boulder, Colorado ( 2 ) Institute of Applied Mathematics and Computer Science, University of Leiden, Leiden, The Netherlands. Informatique théorique et Applications/Théoretical Informaties and Applications ISSN en cours 86/01 898/$ 2.80/ Gauthier-Villars. 90 A. EHRENFEUCHT, G. ROZENBERG following: the new resuit presented in this paper is theorem 5; most of the other results is in one form or the other (and perhaps in a different terminology) retrievable from the literature. However we have decided to make this paper rather self-contained and to provide all the needed results with their (sometimes different from the literature) proofs carried out in a uniform manner . We assume the reader to be familiar with basic formai language theory in particular with rudimentary theory of regular languages (see, e. g., [S]). PRELIMINAIRES We use mostly Standard language theoretic notation and terminology. For a set A, #A dénotes the cardinality of A. For sets A, B, A B dénotes the set theoretic différence of A and B. For a word x, \ x dénotes its length and first (x) dénotes the first letter of x; if x = x 1 jx 2 then y is called a subword of x (also referred to as a segment or a factor of x). The set of all subwords of x is denoted by sub(x) and for a language K, sxxb(k) U sub(x). xek A nonempty word x is called bordered if x=yzy for a nonempty word y; otherwise x is called unbordered. A language C^X + is called a code if every word yec + satisfies the following condition: if y = u 1...u n and y = x 1...x m for n, m^l and u l9..., M, x u..., x m ec then n = m and u { = x t for 1 ^ i n. (In other words, y has a unique représentation in C; subwords u 1?..., u n of this représentation are referred to as C- blocks of y). A code C e E + is called maximal if, for each xel*-c, CUW is not a code. In the sequel of this paper we consider an arbitrary but fixed alphabet E where a= # 1; all languages we will consider are over S. For a language K and a positive integer n, L n (iq = {wek: w = n} and We will define now and recall a number of notions concerning languages they will be central to our paper. (1) K is dense if xesub(k*) for each xes*. Informatique théorique et Applications/Theoretical Informaties and Applications MAXIMAL REGULAR CODE 91 (2) K is fast if there exists a positive integer n such that for each w e sub (K*) there exist i, j;el* such that xy ^ n and xwyek*. (3) X is rich if there exists a positive integer e such that oc m (K*) ^ a m /e for infinitely many positive integer s m. RESULTS In this section we investigate the problem how various properties of a code (such as: fast, dense, rich, regular and maximal) influence each other. Once this relationship is explored we can settle the problem of completing a regular code to a regular maximal code. Our first result is known (see [SM]). However for the sake of completeness we provide its proof (which is different from the proof in [SM]). THEOREM 1: Each maximal code is dense, Proof: First we prove the following result. CLAIM 1: Let C be a code that is not dense. There exists an unbordered word vv c such that w c sub(c*). Proof of Claim 1; Since C is not dense, there exists a word z sub(c*). Let bel, be such that b # first(z) and let w c = zfc z. Clearly w c is unbordered. Moreover w c sub (C*), because z sub(c*). Thus claim 1 holds. Now we prove theorem 1 as follows. Let C be a maximal code. Assume to the contrary that C is not dense. Then let w c be an unbordered word satisfying the statement of claim 1. Consider D = CU{w c }. Let y be an arbitrary word in D +. Since w c is unbordered, y has a unique représentation of the form y = x o w c x 1 w c...w c x n, where n^o (that is if y = u o w c u 1 w c...w c u m where m^o then m = n and u Xi for 1 ^ ï n). Since C is a code and w c sub(c*), y has a unique représentation in D. Thus D is a code. Since C ^ D and w c^sub(c*) we get a contradiction (to the fact that C is maximal). Consequently C must be dense and theorem 1 holds. THEOREM 2: Each rich code is maximal Proof: Let C be a rich code and let e be a positive integer constant satisfying the définition of richness for C. vol. 20, n 1, 1986 92 A. EHRENFEUCHT, G. ROZENBERG Assume to the contrary that C is not maximal Let z be a word such that B = C{J{z} is a code; let \z\ = u Let k be a positive integer. Let n l9.,., n k bea séquence of positive integers such that: a * n 1 n 2 ... n k and ot B.(C*) ^. (1) (Since C is rich and e satisfies the définition of richness of C, such a séquence exists.) Consider r = n 1 + n n fc + /ct. Clearly: OL r (B*) a'. (2) On the other hand let us consider an arbitrary permutation i u..., i k of the set {1,..., *}. Let j^el^c*),...,^el^(c») and let Y (ii,..., i k ) =yi l zy i2 z...y ik z. Since B is a code, if (j u..., j fc ) is a permutation of {1,..., fc} different from (i l9..., j k ), then y(i l9..., i k )ïy(ju --Jù- Consequently from (1) it follows that: From (2) and (3) it follows that: a 1 a 2 a fc... kl *,(B*). (3) e e e Since ea' is a constant (independent of /c), there exists a positive integer k 0 such that, for ail s /c 0, s! (ea f ) s. Consequently (4) yields a contradiction (k was chosen to be an arbitrary positive integer). Thus C must be maximal and theorem 2 holds. THEOREM 3: Each regular code is fast. Proof: Obvious. THEOREM 4: Each dense and fast code is rich, Proof: Let C be a code that is dense and fast. Then there exists a finite set F of ordered pairs of words from S* such that for each wel* there exists (x, y) e F such that xwy e C*. Let q = max { xy : (x, y) e F}, ƒ = # F and (4) CLAIM 2: For eac/i positive integer n there exists a positive integer m ^ such Informatique théorique et Applications/Theoretical Informaties and Applications MAXIMAL REGULAR CODE 93 Proof of claim 2: Let for each wel,*, pair (w) be a fixed element (x, y) of F such that xwyec*. Let n be a positive integer. Let: E (n, x, y) = {we L (S*) : pair (w) = (x, ƒ)}. Clearly for some (x o,j; o )ef, # E(n, x 0, y 0 )^ o n /f. Let p = \x o y o \. Hence: Then Thus if we choose m = n+p we get m ^ n + ? and claim 2 holds. Now theorem 4 follows directly from claim 2. REMARK: Theorems 2 and 4 together are more gênerai than theorem 7.4 (due to Schutzenberger) from [E]. However, it is pointed out by D. Perrin in [P3] that a proof of the gênerai case can be retrieved from the proof of theorem 9.3 in [E]. THEOREM 5: Let C be a regular code, There exists a code D which is dense, fast, regular and such that C ^ D. Proof: Let C be a regular code. We consider separately two cases. (i) C is dense. Then the theorem follows from theorem 3 (take D = C). (ii) C is not dense. Then, by claim 1, there exists an unbordered word xv c such that Let: A = {w c x 1 w c x 2... w c x w c : n ^ 1, x C* and and let D = C U {w c } U A. CLAIM 3: D is a code. Proof of Claim 3: Let yed +. Since w c is unbordered, y has a unique représentation of the form y = x x w c x 2 w c...w c x (that is we can uniquely distinguish all occurrences of w c in y). This représentation provides the basis for the division of y into Z)-blocks which is obtained as follows: (1) A subword w c x J -w c x J w c x j+l w c constitutes a D-block (corresponding to A) if 2^jgn-l, ;\+/ ^ n 1, x p..., x j+l $C* and Xj_ u x j+l + l C*; such a D-block is referred to as an A-block. 94 A. EHRENFEUCHT, G. ROZENBERG (2) Ail occurrences of w c not involved in A-blocks are also D-blocks. (3) Ail Xi s which are not involved in ^4-blocks must be in C* and so they are uniquely divisible in D-blocks (really C-biocks). The définition of A and the fact that w c sub(c*) and w c is unbordered guarantee that such a division is unique. Hence D is a code and claim 3 holds. CLAIM 4: D is dense. Proof of claim 4; Let we E*. Consider y = w c uw c. Reasoning as in the proof of claim 3 we get a (unique) représentation of y in D +. Thus D is dense and claim 4 holds. CLAIM 5: D is regular. Proof: Obvious. CLAIM 6: D is fast. Proof: This follows from claim 5 and theorem 3. M Now theorem 5 follows from claims 3 through 5. Our results yield two interesting corollaries. The first one solves an open problem from the theory of codes (see, e. g., [R] and [P2]). As a matter of fact it provides a more gênerai resuit: Restivo has asked ([R]) whether an arbitrary fînite code can be completed to a maximal regular code we show that even an arbitrary regular code can be completed to a maximal regular code. COROLLARY 1: Let C be a code. If C is regular, then there exists a code D such that C ç D, D is maximal and D is regular, Proof: Let C be a regular code. By theorem 5 there exists a regular code D such that C ^ D, D is fast and dense. Thus, by theorem 4 5 D is rich and so, by theorem 2, D is maximal. Hence corollary 1 holds. Secondly, we notice that theorems 1 through 4 provide an alternative proof of the theorem by Schutzenberger (see [E], p. 94). COROLLARY 2: Let C be a regular code. Then C is maximal if and only if C is dense. Proof: It follows directly from theorems 1 through 4. Informatique théorique et Applications/Theoretical Informaties and Applications MAXIMAL REGULAR CODE 95 DISCUSSION We have established a number of relationships between dense, fast, rich, maximal and regular codes. Using these relationships we were able to demonstrate that each regular code is included in a maximal regular code. In particular we have demonstrated that each rich code is maximal and each maximal code is dense. Hence each rich code is dense. We provide now a direct proof of this result we believe it sheds a different light on this relationship. COROLLARY 3: Each rich code is dense. Proof: Let C be a rich code. Assume that C is not dense. Hence there exists a word z sub(c*); let z = t. Let n be an arbitrary positive integer; n can be represented in the form n = k 1 t + k 2 for some k ^ 0 and k 2 t. An arbitrary word from L n (C + ) can be (starting from the left end) divided into k 1 consécutive subwords of length t leaving a suffix of lengthfe 2.Thus: Consequently: oc (C + ) (a'-l)*ia*2 (a ( -l) fc i Hence: which contradicts the fact that C is rich. Consequently C must be dense and the result holds. To put some of the dependencies we have demonstrated in a better perspective we provide now the following result. THEOREM 6: There exists a maximal code which is not rich. Proof: Consider the family of all full binary trees in which leafs are labelled by a and all inner nodes are labelled by b. Consider now all postfix notations for these trees in this way we get the language P ç {a, b} +. It is well known that P is a code (every forest of full binary trees has a unique représentation in the postfix notation). Consider an arbitrary word ze{a, b} + P. Clearly a M + 1 zep + (we parse a' z ' + i z from right to left assigning +1 to a and 1 to b\ then each subword yielding by summation weight +1 is a tree corresponding to an element of vol. 20, n 1, 1986 96 A. EHRENFEUCHT, G. ROZENBERG P). Hence P U {2} is not a code, because a' 2! + * z would have two different représentations in P +. Thus F is a maximal code. On the other hand it is known (see, e. g., [F], ch. III, sect. 3) that: (Hère one considers random walks on the line of positive integers where a represents a step up and b represents a step down . It turns out that the probability of starting in 0 and not returning to 1 in up to n steps equals 1 in the limit.) Hence P is not rich and the theorem holds. Perhaps the most significant open question in the area of extending codes to their maximal counterparts is (see[p2]): can every biprefix regular code be extended to a maximal biprefix regular code? An answer to this question will certainly make the picture of the whole area clearer. ACKNOWLEDGMENTS The authors gratefully acknowledge the support of NSF grant MCS The authors are indebted to J. Karhumaki and D. Perrin for their comments on the first version of this paper. REFERENCES [BPS] J. BERSTEL, D. PERRIN and M. P. SCHUTZENBERGER, The Theory of Codes (to appear). [E] S. EILENBERG, Automata, Languages and Machines, Vol. A, 1974, Academie Press, New York and London. [F] W. FELLER, An Introduction to Probability Theory and its Applications, Vol. 1, 1950, J. Wiley. [PI] D. PERRIN Ed., Theorie des codes, L.I.T.P. Publication, Paris, [P2] D. PERRIN, Completing Biprefix Codes, Lecture Notes in Computer Science, Vol. 140, 1982, pp [P3] D. PERRIN, Séries formelles et combinatoire du monoide libre, in J. Berstel Ed., Series Formelles, L.I.T.R, Paris, [R] A. RESTIVO, On Codes Having no Finite Complétions, in Automata, Languages and Programming, S. MICHAELSON Ed., Edinburgh University Press, 1976, pp [S] A. SALOMAA, Formai Languages, Academie Press, London, New York, [Sch] M. P. SCHUTZENBERGER, Une théorie algébrique du codage, Séminaire Dubreuil- Pisot, année 55-56, exp. No. 15, Inst. Henri-Poincaré, Paris, [SM] M. P. SCHUTZENBERGER and R. S. MARCUS, Full Decodable Code Word Sets, LR.E. Trans on Inf. Theory, Vol 5, 1959, pp Informatique théorique et Applications/Theoretical Informaties and Applications

Related Search

Similar documents

We Need Your Support

Thank you for visiting our website and your interest in our free products and services. We are nonprofit website to share and download documents. To the running of this website, we need your help to support us.

Thanks to everyone for your continued support.

No, Thanks