Incidentally, if you do what they're proposing for PHP in Java (where you define a non-generic subclass of a generic type), the generic type arguments actually are in the bytecode, and depending on the static type you use to reference the object, they may or may not be enforced...
    public class StringList extends java.util.ArrayList<String> {
        public static void main(String[] args) throws Exception {
            StringList asStringList = new StringList();
            java.util.ArrayList<Integer> asArrayList = (java.util.ArrayList<Integer>)(Object)asStringList;
            System.out.println("It knows it's an ArrayList<String>: " + java.util.Arrays.toString(((java.lang.reflect.ParameterizedType)asArrayList.getClass().getGenericSuperclass()).getActualTypeArguments()));
            System.out.println("But you can save and store Integers in it:");
            asArrayList.add(42);
            System.out.println(asArrayList.get(0));
            System.out.println(asArrayList.get(0).getClass());
            System.out.println("Unless it's static type is StringArrayList:");
            System.out.println(asStringList.get(0));
        }
    }

That prints out:

    It knows it's an ArrayList<String>: [class java.lang.String]
    But you can save and store Integers in it:
    42
    class java.lang.Integer
    Unless it's static type is StringArrayList:
    Exception in thread "main" java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.String (java.lang.Integer and java.lang.String are in module java.base of loader 'bootstrap')
        at StringList.main(StringList.java:11)
I may be missing something about how the PHP compiler/interpreter works, but I don't quite understand why this is apparently feasible to implement:
    class BlogPostRepository extends BaseRepository<BlogPost> { ... }
    $repo = new BlogPostRepository();
but the following would be very hard:
    $repo = new Repository<BlogPost>();
They write that the latter would need runtime support, instead of only compile time support. But why couldn't the latter be (compile time) syntactic sugar for the former, so to speak?
(As long as you don't allow the generic parameter to be dynamic / unknown at compile time, of course.)
The former merely exposes a `BlogPostRepository` class. The latter requires some mechanism for creating a generic object of concrete type, which is a lot bigger change to the implementation. Does each parametrized generic type have its own implementation? Or does each object have sufficient RTTI to dynamically dispatch? And what are the implications for module API data structures? Etc. In other words, this limitation avoids tremendously disruptive implementation impacts. Not pretty, but we're talking PHP here anyway. ;-)
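For comparison, Java has a well-known trick that looks a lot like the "sugar" idea: writing `new Repository<BlogPost>() {}` makes the compiler generate an anonymous subclass, and the subclass declaration records the type argument. This is only a sketch of how that works in Java, not a claim about how PHP would implement it; `Repository`, `BlogPost`, and `entityType` are hypothetical names.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical entity class, used only as a type argument.
class BlogPost {}

// Hypothetical generic base class; abstract so callers have to subclass it.
abstract class Repository<T> {
    // Reads T from the generic superclass recorded for the concrete subclass.
    // Assumes the object's class directly extends Repository<...>.
    Type entityType() {
        ParameterizedType parent = (ParameterizedType) getClass().getGenericSuperclass();
        return parent.getActualTypeArguments()[0];
    }
}

class SugarDemo {
    static Type demo() {
        // The trailing {} creates an anonymous subclass of Repository<BlogPost>,
        // so the type argument is stored with that subclass declaration.
        Repository<BlogPost> repo = new Repository<BlogPost>() {};
        return repo.entityType();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // class BlogPost
    }
}
```

Note the cost: every distinct instantiation site gets its own generated class, which is one of the implementation questions raised above.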
In Hack, collection objects were one of the biggest early mistakes, and it took a huge amount of effort to undo them. It turns out that the copy-on-write semantics of PHP arrays were extremely important for performance and good APIs. Being able to pass arrays to things without fear of mutation allowed for tons of optimizations and meant not needing to copy things just in case. This is why Hack switched to using `dict`, `vec`, and `keyset` rather than collection objects.
More generally, it's weird to see a whole blog post about generics for PHP not even mentioning Hack's generics designs. A lot of thought and iteration went into this like 5-10 years ago.
See https://docs.hhvm.com/hack/arrays-and-collections/object-col... and https://docs.hhvm.com/hack/arrays-and-collections/vec-keyset...
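To make the "copy things just in case" point concrete, here is a small Java sketch (Java collections are mutable reference objects, so they have the same problem as collection objects). The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DefensiveCopyDemo {
    // Hypothetical callee that mutates whatever list it is handed.
    static void appendMarker(List<String> items) {
        items.add("marker");
    }

    // The caller protects its data by handing over a copy "just in case".
    static int sizeAfterSafeCall(List<String> mine) {
        appendMarker(new ArrayList<>(mine)); // defensive copy, paid on every call
        return mine.size();
    }

    // Without the copy, the callee's mutation leaks back to the caller.
    static int sizeAfterUnsafeCall(List<String> mine) {
        appendMarker(mine);
        return mine.size();
    }

    public static void main(String[] args) {
        List<String> data = new ArrayList<>(Arrays.asList("a", "b"));
        System.out.println(sizeAfterSafeCall(data));   // 2
        System.out.println(sizeAfterUnsafeCall(data)); // 3
    }
}
```

With value semantics (copy-on-write arrays, or `vec`/`dict`/`keyset`), the caller gets the safe behavior without paying for an eager copy on every call.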
Can someone smarter than me explain what they mean by "reified generics", "erased generics", and a use case for when to use one over the other?
For example, Java uses erased generics. Once the code is compiled, the generic type arguments are gone from the executable bytecode: List<String> becomes a raw List. This is called type erasure.
C# uses reified generics, where this information is preserved: List<String> is still List<String> after compilation.
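A quick way to see erasure from the Java side, as a minimal sketch (the class and method names are just for illustration):

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Returns true when two lists share one runtime class, whatever their
    // compile-time type arguments were.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        System.out.println(sameRuntimeClass(strings, numbers)); // true
        System.out.println(strings.getClass().getName());       // java.util.ArrayList
    }
}
```

Both lists are plain `java.util.ArrayList` objects at run time, which is why Java rejects a check like `obj instanceof List<String>` at compile time.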
And as a consequence, C# can pack the value types directly in the generic data structure, instead of holding references to heap-allocated objects.
This is very important both for cache locality and for minimizing garbage collector pressure.
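The Java side of that trade-off can be sketched like this: a generic list cannot hold `int` directly, so each element is boxed into a heap object, while a primitive array stores the values inline. Names here are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    // Stores an int in a generic list and reports the class of what was stored.
    static Class<?> storedClass(int value) {
        List<Integer> list = new ArrayList<>();
        list.add(value);              // autoboxed to java.lang.Integer
        Object element = list.get(0); // the list holds a reference, not an int
        return element.getClass();
    }

    public static void main(String[] args) {
        int[] packed = {1, 2, 3};     // primitives stored inline in the array
        System.out.println(packed.getClass().getComponentType()); // int
        System.out.println(storedClass(42)); // class java.lang.Integer
    }
}
```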
And Java has been working on Project Valhalla for about a decade (since 2014) to retrofit the ability to do this to the existing Java language...
Reified generics don't seem to be a goal mentioned on the project website. Am I missing something?
https://openjdk.org/projects/valhalla/
There is an interesting article which mentions reification, but that's all I could locate.
How We Got the Generics We Have (Or, how I learned to stop worrying and love erasure)
https://openjdk.org/projects/valhalla/design-notes/in-defens...
I'm not smarter than you, but:
I believe the terms reified generics and erased generics are the type of sweaty donkey ball terminology you get from professional CS academics.
Sticking my neck out further.
Reified generics means the type is available at run time. In C# you can write if(obj.GetType() == typeof(typename))
Erased generics means the type information is not available at run time. That's the way Java does it and it kinda sucks.
Academics invent short names for common (in their field) concepts not because they're 'sweaty' but because, if the thing you're going to mention in every second paragraph of a good chunk of your communication with other people working on the same topic requires a full sentence to explain, you're going to (A) get really annoyed at having to type it out all the time and (B) probably explain it slightly differently every time and confuse people.
Academic jargon isn't invented to be elitist, it's invented to improve communication.
(of course there's a good chance you understand this already, and you're just making a dumb joke, but I figured I'd explain this anyway for the benefit of everyone reading)
I don't take issue with the naming but with the names that feel a bit beyond my ken. "Erased" makes sense when explained but not before. "Reified" is a word I simply do not use so it feels like academia run amok.
Regardless, I recognize myself as the point of failure, but those names do strike me as academia speak, though better than some/many. <shrug>
Another shrug, but part of it is that the PL community (programming language community) is pretty deep into its own jargon that doesn’t have as much overlap as you might think, with other subfields of computer science.
People describe a type system as “not well-founded” or “unsound” and those are specific jabs at the axioms, and people talk about “system F” or “type erasure” or “reification”. Polymorphism can be “ad-hoc” or “parametric”, and type parameters can be invariant, covariant, and contravariant. It’s just a lot of jargon and I think the main reason it’s not intuitive to people outside the right fields is that the actual concepts are mostly unfamiliar.
> Erased generics the type information is not available at run time. That's the way Java does it and it kinda sucks.
To be more precise: in Java, generics on class/method/field declarations are available at runtime via reflection. The issue is that they aren’t available for instances. So a java.util.ArrayList<java.lang.String> instance is indistinguishable at runtime from a java.util.ArrayList<java.lang.Object> instance
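A small sketch of that distinction, assuming a hypothetical class with one generic field: the declaration's type argument can be read back via reflection, but the instance itself only knows it is an `ArrayList`.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

class DeclarationReflectionDemo {
    // A field declaration: its generic signature is recorded in the class file.
    static List<String> names = new ArrayList<>();

    // Reads the first type argument from the named field's declared type.
    static Type declaredElementType(String fieldName) {
        try {
            Field field = DeclarationReflectionDemo.class.getDeclaredField(fieldName);
            ParameterizedType type = (ParameterizedType) field.getGenericType();
            return type.getActualTypeArguments()[0];
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(declaredElementType("names")); // class java.lang.String
        System.out.println(names.getClass());             // class java.util.ArrayList
    }
}
```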
In the sense of an affirmative vote, the proper word is "yea."
write PHP a lot. every day.
I wish we had typed arrays. Totally not gonna happen; there have been RFCs, but I have enough boilerplate classes that are like
Class Option
Class Options implements Iterator, countable, etc.
Options[0], Options[1], Options[2]
or Options->getOption('some.option.something');
A lot of wrapper stuff like that is semi-tedious, and the implementation can vary wildly.
Also because a lot of times in PHP you start with a generic array and decide you need structure around it, so you implement a class, and then you need an array of that class.
Not to mention a bunch of WSDLs that autogenerate ArrayOfString classes...
Nailed it.
This is the core problem with PHP for me. I love PHP and use it every day. Part of that is the strength and versatility of the array implementation (i.e. a hashmap). However, the problem is always the fact that an array can't be typed.
If they could just introduce that, it would solve 80% of userland issues overnight.
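For what a runtime-checked "typed array" can look like without reified generics, Java's standard library has `Collections.checkedList`, which wraps a list and checks every inserted element against a class you pass in. This is a sketch for comparison only, not a proposal for how PHP should do it; the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CheckedListDemo {
    // Tries to sneak a non-String into a runtime-checked list of strings.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean rejectsWrongType() {
        List<String> typed = Collections.checkedList(new ArrayList<>(), String.class);
        typed.add("ok");
        List raw = typed;   // raw view bypasses the compile-time checks
        try {
            raw.add(42);    // the wrapper checks the element type at run time
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsWrongType()); // true
    }
}
```

The element type is carried by the wrapper instance (the `String.class` argument), so one class serves every element type instead of a hand-written `ArrayOfString` per type.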