I'm having trouble understanding whether it is possible to prove that two sets (in this case regular languages) are identical and thus interchangeable. From what I understand, sets can be equivalent even if they are not constructively equal.

Regular languages are sets of strings, but I don't see how to say that r1 = r2 so that something like symmetry can be used in a proof.

Here is my RegularLanguage declaration:

```
Inductive RegularLanguage (A : Set) : Set :=
| EmptyLang : RegularLanguage A
| EmptyStr : RegularLanguage A
| Unit : A -> RegularLanguage A
| Union :
RegularLanguage A
-> RegularLanguage A
-> RegularLanguage A
| Concat :
RegularLanguage A
-> RegularLanguage A
-> RegularLanguage A
| KleeneStar : RegularLanguage A -> RegularLanguage A
.
```

Even though I say the type is Set, this doesn't create a set of strings, as far as I can see. Do I need some function of type `RegularLanguage -> Set of strings`

?

Thank you for the help.

# Best How To :

There are a few misunderstandings about some Coq concepts which I'll try to clarify.

First, in Coq, you shouldn't view `Set`

as what we call "set" in traditional mathematics. Instead, you should view it as a *type*. Coq also has `Type`

, but for the purposes of this post you can view both `Set`

and `Type`

as being interchangeable. To avoid confusion, from now on, I will use "set" to refer to the usual concept of set in traditional mathematics, and "type" to refer to elements of `Set`

and `Type`

in Coq.

So, what exactly is the difference between a set and a type? Well, in normal mathematics, it makes sense to ask oneself whether *anything* is a member of *any* given set. Thus, if we were to develop the theory of regular expressions in normal mathematics, and each regular expression were seen as a set, it would make sense to ask questions such as `0 \in EmptyLang`

, because, even though `0`

is a natural number, it could be the element of any set *a priori*. As a less contrived example, the empty string is both a member of the full language (i.e. the one that contains all strings) and of the Kleene closure of any base language.

In Coq, on the other hand, each valid element of the language has exactly *one* type. The empty string, for instance, has type `list A`

for some `A`

, which is written `[] : list A`

. If we try to ask whether `[]`

belongs to some regular language using the "has type" syntax, we get an error; try typing e.g.

```
Check ([] : EmptyLang).
```

Both sets and types can be seen as collections of elements, but types are in a sense more restrictive: for instance, one can take the intersection of two sets, but there's no way of taking the intersection of two types.

Second, when you write

```
Inductive RegularLanguage (A : Set) : Set := (* ... *)
```

this does *not* mean that the elements that you list below the header define sets *nor* types. What this means is that, for each type `A`

(the `(A : Set)`

part), you are defining a new type noted `RegularLanguage A`

(the `RegularLanguage (A : Set) : Set`

part), whose elements are *freely generated* by the list of constructors given. Thus, we have

```
EmptyLang : RegularLanguage nat
RegularLanguage nat : Set
```

but we *don't* have

```
EmptyLang : Set
```

(once again, you can try typing all the above judgments into Coq to see which are accepted and which aren't).

Being "freely generated" means, in particular, that the constructors you listed are *injective* and *disjoint*. As larsr noted previously, in particular, it is not the case that `Union l1 l2 = Union l2 l1`

; as a matter of fact, you will usually be able to prove `Union l1 l2 <> Union l2 l1`

. The problem is that there is a mismatch between the notion of equality that you get for inductively defined types in Coq (which you can't change) and your intended notion of equality for regular languages.

While there are a few ways around this, I think the easiest one is to use the setoid rewrite feature. This would involve first defining a function or predicate (e.g., as larsr suggested, a boolean function `regexp_match : RegularLanguage A -> list A -> bool`

) to determine when a regular language contains some string. You can then define an equivalence relation on languages:

```
Definition regexp_equiv A (l1 l2 : RegularLanguage A) : Prop :=
forall s, regexp_match l1 s = regexp_match l2 s.
```

and use setoid rewrite to rewrite with this equivalence relation. There is a small caveat, however, which is that you can only rewrite with an equivalence relation in contexts that are compatible with this equivalence relation, and you need to explicitly prove lemmas to do so. You can find more details in the reference manual.